A polypeptide has first and second domains which enable the polypeptide to be translocated into a target cell or which increase the
solubility of the polypeptide, or both, and further enable the polypeptide to cleave one or more vesicle or plasma-membrane associated 

toxin, without the toxicity associated with the natural molecule. The polypeptide can also contain a third domain that targets it to a specific 
cell, rendering the polypeptide useful in inhibition of exocytosis in target cells. Fusion proteins comprising the polypeptide, nucleic acids 
encoding the polypeptide and methods of making the polypeptide are also provided. Controlled activation of the polypeptide is possible

RECOMBINANT TOXIN FRAGMENTS

This invention relates to recombinant toxin fragments, to DNA encoding these 
fragments and to their uses such as in a vaccine and for in vitro and in vivo 
purposes. 

The clostridial neurotoxins are potent inhibitors of calcium-dependent 
neurotransmitter secretion in neuronal cells. They are currently considered to 
mediate this activity through a specific endoproteolytic cleavage of at least one of 
three vesicle or pre-synaptic membrane associated proteins VAMP, syntaxin or 
SNAP-25 which are central to the vesicle docking and membrane fusion events of 
neurotransmitter secretion. The neuronal cell targeting of tetanus and botulinum 
neurotoxins is considered to be a receptor mediated event following which the 
toxins become internalised and subsequently traffic to the appropriate intracellular 
compartment where they effect their endopeptidase activity. 

The clostridial neurotoxins share a common architecture of a catalytic L-chain (LC, 
ca 50 kDa) disulphide linked to a receptor binding and translocating H-chain (HC, 
ca 100 kDa). The HC polypeptide is considered to comprise all or part of two
distinct functional domains. The carboxy-terminal half of the HC (ca 50 kDa),
termed the H c domain, is involved in the high affinity, neurospecific binding of the 
neurotoxin to cell surface receptors on the target neuron, whilst the amino-terminal 
half, termed the H N domain (ca 50 kDa), is considered to mediate the translocation 
of at least some portion of the neurotoxin across cellular membranes such that the 
functional activity of the LC is expressed within the target cell. The H N domain also 
has the property, under conditions of low pH, of forming ion-permeable channels 
in lipid membranes; this may in some manner relate to its translocation function.

For botulinum neurotoxin type A (BoNT/A) these domains are considered to reside 
within amino acid residues 872-1296 for the H c , amino acid residues 449-871 for
the H N and residues 1-448 for the LC. Digestion with trypsin effectively degrades 
the H c domain of the BoNT/A to generate a non-toxic fragment designated LH N ,
which is no longer able to bind to and enter neurons (Fig. 1). The LH N fragment so
produced also has the property of enhanced solubility compared to both the parent
holotoxin and the isolated LC.
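
The residue ranges quoted above for BoNT/A can be tabulated and checked for consistency. The following is a minimal illustrative sketch only: the names DOMAINS, domain_length and lhn_span are assumptions introduced here, and only the residue numbers are taken from the text.

```python
# Residue ranges for BoNT/A as quoted in the text (1-based, inclusive).
# DOMAINS, domain_length and lhn_span are illustrative names, not from the source.
DOMAINS = {
    "LC": (1, 448),      # light chain
    "H_N": (449, 871),   # amino-terminal half of the heavy chain
    "H_C": (872, 1296),  # carboxy-terminal half of the heavy chain
}


def domain_length(name):
    """Return the number of residues in the named domain."""
    start, end = DOMAINS[name]
    return end - start + 1


def lhn_span():
    """Residue span of the LH N fragment: LC plus H N, with H C removed."""
    return DOMAINS["LC"][0], DOMAINS["H_N"][1]


if __name__ == "__main__":
    for name in DOMAINS:
        print(name, domain_length(name))
    print("LH_N spans residues", lhn_span())
```

On these figures the H N domain is 423 residues long, which matches the "423" in the designation LH 423 /A used later in the description.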

It is therefore possible to provide functional definitions of the domains within the 
neurotoxin molecule, as follows: 

(A) clostridial neurotoxin light chain: 

-a metalloprotease exhibiting high substrate specificity for vesicle and/or
plasma-membrane associated proteins involved in the exocytotic process. In particular, it
cleaves one or more of SNAP-25, VAMP (synaptobrevin / cellubrevin) and syntaxin. 

(B) clostridial neurotoxin heavy chain H N domain: 

-a portion of the heavy chain which enables translocation of that portion of the 
neurotoxin molecule such that a functional expression of light chain activity occurs 
within a target cell. 

-the domain responsible for translocation of the endopeptidase activity, following 
binding of neurotoxin to its specific cell surface receptor via the binding domain, 
into the target cell. 

-the domain responsible for formation of ion-permeable pores in lipid membranes 
under conditions of low pH. 

-the domain responsible for increasing the solubility of the entire polypeptide 
compared to the solubility of light chain alone. 

(C) clostridial neurotoxin heavy chain H c domain:

-a portion of the heavy chain which is responsible for binding of the native
holotoxin to cell surface receptor(s) involved in the intoxicating action of clostridial
toxin prior to internalisation of the toxin into the cell.

The identity of the cellular recognition markers for these toxins is currently not 
understood and no specific receptor species have yet been identified although 
Kozaki et al. have reported that synaptotagmin may be the receptor for botulinum 
neurotoxin type B. It is probable that each of the neurotoxins has a different 
receptor. 

It is desirable to have positive controls for toxin assays, to develop clostridial toxin 
vaccines and to develop therapeutic agents incorporating desirable properties of 
clostridial toxin. 

However, due to its extreme toxicity, the handling of native toxin is hazardous. 

The present invention seeks to overcome or at least ameliorate problems associated 
with production and handling of clostridial toxin. 

Accordingly, the invention provides a polypeptide comprising first and second
domains, wherein said first domain is adapted to cleave one or more vesicle or
plasma-membrane associated proteins essential to neuronal exocytosis and wherein
said second domain is adapted (i) to translocate the polypeptide into the cell or (ii) 
to increase the solubility of the polypeptide compared to the solubility of the first 
domain on its own or (iii) both to translocate the polypeptide into the cell and to 
increase the solubility of the polypeptide compared to the solubility of the first 
domain on its own, said polypeptide being free of clostridial neurotoxin and free of 
any clostridial neurotoxin precursor that can be converted into toxin by proteolytic 
action. Accordingly, the invention may thus provide a single polypeptide chain 
containing a domain equivalent to a clostridial toxin light chain and a domain 
providing the functional aspects of the H N of a clostridial toxin heavy chain, whilst 
lacking the functional aspects of a clostridial toxin H c domain.

For the purposes of the invention, the functional property or properties of the H N
of a clostridial toxin heavy chain that are required to be exhibited by the second 
domain of the polypeptide of the invention are either (i) translocation of the 
polypeptide into a cell, or (ii) increasing solubility of the polypeptide compared to 
solubility of the first domain on its own or (iii) both (i) and (ii). References hereafter 
to a H N domain or to the functions of a H N domain are references to this property 
or properties. The second domain is not required to exhibit other properties of the
H N domain of a clostridial toxin heavy chain. 

A polypeptide of the invention can thus be soluble but lack the translocation 
function of a native toxin - this is of use in providing an immunogen for vaccinating
or assisting to vaccinate an individual against challenge by toxin. In a specific 
embodiment of the invention described in an example below a polypeptide 
designated LH 423 /A elicited neutralising antibodies against type A neurotoxin. A 
polypeptide of the invention can likewise thus be relatively insoluble but retain the 
translocation function of a native toxin - this is of use if solubility is imparted to a 
composition made up of that polypeptide and one or more other components by 
one or more of said other components. 

The first domain of the polypeptide of the invention cleaves one or more vesicle or 
plasma-membrane associated proteins essential to the specific cellular process of 
exocytosis, and cleavage of these proteins results in inhibition of exocytosis, 
typically in a non-cytotoxic manner. The cell or cells affected are not restricted to 
a particular type or subgroup but can include both neuronal and non-neuronal cells. 
The activity of clostridial neurotoxins in inhibiting exocytosis has, indeed, been
observed almost universally in eukaryotic cells expressing a relevant cell surface 
receptor, including such diverse cells as from Aplysia (sea slug), Drosophila (fruit 
fly) and mammalian nerve cells, and the activity of the first domain is to be 
understood as including a corresponding range of cells. 

The polypeptide of the invention may be obtained by expression of a recombinant 
nucleic acid, preferably a DNA, and is a single polypeptide, that is to say not
cleaved into separate light and heavy chain domains. The polypeptide is thus
available in convenient and large quantities using recombinant techniques.

WO 98/07864 PCT/GB97/02273

In a polypeptide according to the invention, said first domain preferably comprises 
a clostridial toxin light chain or a fragment or variant of a clostridial toxin light 
chain. The fragment is optionally an N-terminal, or C-terminal fragment of the light 
chain, or is an internal fragment, so long as it substantially retains the ability to 
cleave the vesicle or plasma-membrane associated protein essential to exocytosis. 
The minimal domains necessary for the activity of the light chain of clostridial 
toxins are described in J. Biol. Chem., Vol.267, No. 21, July 1992, pages 14721- 
14729. The variant has a different peptide sequence from the light chain or from 
the fragment, though it too is capable of cleaving the vesicle or plasma-membrane 
associated protein. It is conveniently obtained by insertion, deletion and/or 
substitution of a light chain or fragment thereof. In embodiments of the invention 
described below a variant sequence comprises (i) an N-terminal extension to a 
clostridial toxin light chain or fragment (ii) a clostridial toxin light chain or fragment 
modified by alteration of at least one amino acid (iii) a C-terminal extension to a 
clostridial toxin light chain or fragment, or (iv) combinations of 2 or more of (i)-(iii). 

In further embodiments, the variant contains an amino acid
sequence modified so that (a) there is no protease sensitive region between the LC
and H N components of the polypeptide, or (b) the protease sensitive region is 
specific for a particular protease. This latter embodiment is of use if it is desired 
to activate the endopeptidase activity of the light chain in a particular environment 
or cell. In general, however, the polypeptides of the invention are activated prior to
administration.

The first domain preferably exhibits endopeptidase activity specific for a substrate 
selected from one or more of SNAP-25, synaptobrevin/VAMP and syntaxin. The 
clostridial toxin is preferably botulinum toxin or tetanus toxin. 



In an embodiment of the invention described in an example below, the toxin light
chain and the portion of the toxin heavy chain are of botulinum toxin type A. In
a further embodiment of the invention described in an example below, the toxin 
light chain and the portion of the toxin heavy chain are of botulinum toxin type B. 
The polypeptide optionally comprises a light chain or fragment or variant of one 
toxin type and a heavy chain or fragment or variant of another toxin type. 

In a polypeptide according to the invention said second domain preferably 
comprises a clostridial toxin heavy chain H N portion or a fragment or variant of a 
clostridial toxin heavy chain H N portion. The fragment is optionally an N-terminal 
or C-terminal or internal fragment, so long as it retains the function of the H N 
domain. Teachings of regions within the H N responsible for its function are 
provided for example in Biochemistry 1995, 34, pages 15175-15181 and Eur. J.
Biochem. 1989, 185, pages 197-203. The variant has a different sequence from
the H N domain or fragment, though it too retains the function of the H N domain. 
It is conveniently obtained by insertion, deletion and/or substitution of a H N domain 
or fragment thereof. In embodiments of the invention, described below, it 
comprises (i) an N-terminal extension to a H N domain or fragment, (ii) a C-terminal 
extension to a H N domain or fragment, (iii) a modification to a H N domain or 
fragment by alteration of at least one amino acid, or (iv) combinations of 2 or more 
of (i)-(iii). The clostridial toxin is preferably botulinum toxin or tetanus toxin. 

The invention also provides a polypeptide comprising a clostridial neurotoxin light 
chain and an N-terminal fragment of a clostridial neurotoxin heavy chain, the
fragment preferably comprising at least 423 of the N-terminal amino acids of the 
heavy chain of botulinum toxin type A, 417 of the N-terminal amino acids of the 
heavy chain of botulinum toxin type B or the equivalent number of N-terminal 
amino acids of the heavy chain of other types of clostridial toxin such that the 
fragment possesses an equivalent alignment of homologous amino acid residues. 

These polypeptides of the invention are thus not composed of two or more 
polypeptides, linked for example by disulphide bridges into composite molecules.
Instead, these polypeptides are single chains and are not active or their activity is
significantly reduced in an in vitro assay of neurotoxin endopeptidase activity.

Further, the polypeptides may be susceptible to be converted into a form exhibiting 
endopeptidase activity by the action of a proteolytic agent, such as trypsin. In this 
way it is possible to control the endopeptidase activity of the toxin light chain. 

In a specific embodiment of the invention described in an example below, there is 
provided a polypeptide lacking a portion designated H c of a clostridial toxin heavy 
chain. This portion, seen in the naturally produced toxin, is responsible for binding 
of toxin to cell surface receptors prior to internalisation of the toxin. This specific 
embodiment is therefore adapted so that it cannot be converted into active toxin,
for example by the action of a proteolytic enzyme. The invention thus also 
provides a polypeptide comprising a clostridial toxin light chain and a fragment of 
a clostridial toxin heavy chain, said fragment being not capable of binding to those 
cell surface receptors involved in the intoxicating action of clostridial toxin, and it 
is preferred that such a polypeptide lacks an intact portion designated H c of a 
clostridial toxin heavy chain. 

In further embodiments of the invention there are provided compositions containing
a polypeptide comprising a clostridial toxin light chain and a portion designated H N
of a clostridial toxin heavy chain, and wherein the composition is free of clostridial
toxin and free of any clostridial toxin precursor that may be converted into
clostridial toxin by the action of a proteolytic enzyme. Examples of these
compositions include those containing toxin light chain and H N sequences of
botulinum toxin types A, B, C1, D, E, F and G.



The polypeptides of the invention are conveniently adapted to bind to, or include, 
a ligand for targeting to desired cells. The polypeptide optionally comprises a
sequence that binds to, for example, an immunoglobulin. A suitable sequence is 
a tandem repeat synthetic IgG binding domain derived from domain B of 
Staphylococcal protein A. Choice of immunoglobulin specificity then determines 
the target for a polypeptide - immunoglobulin complex. Alternatively, the
polypeptide comprises a non-clostridial sequence that binds to a cell surface
receptor, suitable sequences including insulin-like growth factor-1 (IGF-1) which
binds to its specific receptor on particular cell types and the 14 amino acid residue 
sequence from the carboxy-terminus of cholera toxin A subunit which is able to 
bind the cholera toxin B subunit and thence to GM1 gangliosides. A polypeptide 
according to the invention thus, optionally, further comprises a third domain 
adapted for binding of the polypeptide to a cell. 

In a second aspect the invention provides a fusion protein comprising a fusion of 
(a) a polypeptide of the invention as described above with (b) a second polypeptide 
adapted for binding to a chromatography matrix so as to enable purification of the 
fusion protein using said chromatography matrix. It is convenient for the second 
polypeptide to be adapted to bind to an affinity matrix, such as a glutathione 
Sepharose, enabling rapid separation and purification of the fusion protein from an 
impure source, such as a cell extract or supernatant. 

One possible second purification polypeptide is glutathione-S-transferase (GST), 
and others will be apparent to a person of skill in the art, being chosen so as to 
enable purification on a chromatography column according to conventional 
techniques. 

As noted above, by proteolytic treatment, for example using trypsin, of a 
polypeptide of the invention it is possible to induce endopeptidase activity in the 
treated polypeptide. A third aspect of the invention provides a composition 
comprising a derivative of a clostridial toxin, said derivative retaining at least 10% 
of the endopeptidase activity of the clostridial toxin, said derivative further being 
non-toxic in vivo due to its inability to bind to cell surface receptors, and wherein 
the composition is free of any component, such as toxin or a further toxin 
derivative, that is toxic in vivo . The activity of the derivative preferably approaches 
that of natural toxin, and is thus preferably at least 30% and most preferably at 
least 60% of natural toxin. The overall endopeptidase activity of the composition 
will, of course, also be determined by the amount of the derivative that is present.

While it is known to treat naturally produced clostridial toxin to remove the H c
domain, this treatment does not totally remove toxicity of the preparation; instead
some residual toxin activity remains. Natural toxin treated in this way is therefore
still not entirely safe. The composition of the invention, derived by treatment of
a pure source of polypeptide, advantageously is free of toxicity, and can
conveniently be used as a positive control in a toxin assay, as a vaccine against 
clostridial toxin or for other purposes where it is essential that there is no residual 
toxicity in the composition. 

The invention enables production of the polypeptides and fusion proteins of the 
invention by recombinant means. 



A fourth aspect of the invention provides a nucleic acid encoding a polypeptide or 
a fusion protein according to any of the aspects of the invention described above. 

In one embodiment of this aspect of the invention, a DNA sequence provided to 
code for the polypeptide or fusion protein is not derived from native clostridial 
sequences, but is an artificially derived sequence not preexisting in nature.

A specific DNA (SEQ ID NO: 1) described in more detail below encodes a
polypeptide or a fusion protein comprising nucleotides encoding residues 1-871 of 
a botulinum toxin type A. Said polypeptide comprises the light chain domain and 
the first 423 amino acid residues of the amino terminal portion of a botulinum toxin 
type A heavy chain. This recombinant product is designated LH 423 /A (SEQ ID NO: 
2). 



In a second embodiment of this aspect of the invention a DNA sequence which 
codes for the polypeptide or fusion protein is derived from native clostridial 
sequences but codes for a polypeptide or fusion protein not found in nature. 



A specific DNA (SEQ ID NO: 19) described in more detail below encodes a
polypeptide or a fusion protein and comprises nucleotides encoding residues
1-1171 of a botulinum toxin type B. Said polypeptide comprises the light chain
domain and the first 728 amino acid residues of the amino terminal portion of a
botulinum toxin type B heavy chain. This recombinant product is designated LH 728 /B (SEQ
ID NO: 20).



The invention thus also provides a method of manufacture of a polypeptide 
comprising expressing in a host cell a DNA according to the fourth aspect of the
invention. The host cell is suitably not able to cleave a polypeptide or fusion 
protein of the invention so as to separate light and heavy toxin chains; for example, 
a non-clostridial host. 

The invention further provides a method of manufacture of a polypeptide 
comprising expressing in a host cell a DNA encoding a fusion protein as described 
above, purifying the fusion protein by elution through a chromatography column 
adapted to retain the fusion protein, eluting through said chromatography column 
a ligand adapted to displace the fusion protein and recovering the fusion protein. 
Production of substantially pure fusion protein is thus made possible. Likewise, the 
fusion protein is readily cleaved to yield a polypeptide of the invention, again in 
substantially pure form, as the second polypeptide may conveniently be removed 
using the same type of chromatography column. 

The LH N /A derived from dichain native toxin requires extended digestion with 
trypsin to remove the C-terminal half of the heavy chain, the H c domain. The loss
of this domain effectively renders the toxin inactive in vivo by preventing its 
interaction with host target cells. There is, however, a residual toxic activity which 
may indicate a contaminating, trypsin insensitive, form of the whole type A 
neurotoxin. 

In contrast, the recombinant preparations of the invention are the product of a 
discrete, defined gene coding sequence and cannot be contaminated by full length
toxin protein. Furthermore, the product as recovered from E. coli, and from other
recombinant expression hosts, is an inactive single chain peptide or, if expression
hosts produce a processed, active polypeptide, it is not a toxin. Endopeptidase
activity of LH 423 /A, as assessed by the current in vitro peptide cleavage assay, is 
wholly dependent on activation of the recombinant molecule between residues 430 
and 454 by trypsin. Other proteolytic enzymes that cleave between these two 
residues are generally also suitable for activation of the recombinant molecule. 
Trypsin cleaves the peptide bond C-terminal to Arginine or C-terminal to Lysine and 
is suitable as these residues are found in the 430-454 region and are exposed (see 
Fig. 12). 
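
The cleavage rule stated above (trypsin cleaves C-terminal to arginine or lysine) can be illustrated with a short sketch. This is a hedged illustration only: the function name tryptic_sites is an assumption, the peptide is a made-up toy sequence rather than any toxin sequence, and the rule is deliberately simplified (for example, it ignores reduced cleavage before proline).

```python
# Trypsin rule as stated in the text: cleavage occurs C-terminal to Arg (R) or Lys (K).
# The peptide used below is a made-up toy sequence, NOT a real toxin sequence.

def tryptic_sites(sequence, first=1, last=None):
    """Return 1-based positions of K/R residues lying within [first, last].

    Cleavage is taken to occur immediately after each returned position.
    """
    if last is None:
        last = len(sequence)
    return [
        pos
        for pos, residue in enumerate(sequence, start=1)
        if residue in "KR" and first <= pos <= last
    ]


toy_peptide = "AGKTLSRPEDKVAL"

if __name__ == "__main__":
    print(tryptic_sites(toy_peptide))                   # all candidate sites
    print(tryptic_sites(toy_peptide, first=5, last=10))  # sites in a sub-window
```

Restricting the search to a window, as in the second call, mirrors the way the text confines activation to a defined region of the molecule.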

The recombinant polypeptides of the invention are potential therapeutic agents for 
targeting to cells expressing the relevant substrate but which are not implicated in 
effecting botulism. An example might be where secretion of neurotransmitter is 
inappropriate or undesirable or alternatively where a neuronal cell is hyperactive in 
terms of regulated secretion of substances other than neurotransmitter. In such an 
example the function of the H c domain of the native toxin could be replaced by an 
alternative targeting sequence providing, for example, a cell receptor ligand and/or 
translocation domain. 

One application of the recombinant polypeptides of the invention will be as a 

WO-A-94/21300. The recombinant product will also find application as a non-toxic
standard for the assessment and development of in vitro assays for detection of 
functional botulinum or tetanus neurotoxins either in foodstuffs or in environmental 
samples, for example as disclosed in EP-A-0763131.

A further option is addition, to the C-terminal end of a polypeptide of the invention, 
of a peptide sequence which allows specific chemical conjugation to targeting 
ligands of both protein and non-protein origin. 



In yet a further embodiment an alternative targeting ligand is added to the
N-terminus of polypeptides of the invention. Recombinant LH N derivatives have been
designed that have specific protease cleavage sites engineered at the C-terminus
of the LC at the putative trypsin sensitive region and also at the extreme
C-terminus of the complete protein product. These sites will enhance the activational
specificity of the recombinant product such that the dichain species can only be 
activated by proteolytic cleavage of a more predictable nature than use of trypsin. 

The LH N enzymatically produced from native BoNT/A is an efficient immunogen and 
thus the recombinant form with its total divorce from any full length neurotoxin 
represents a vaccine component. The recombinant product may serve as a basal 
reagent for creating defined protein modifications in support of any of the above 
areas. 

Recombinant constructs are assigned distinguishing names on the basis of their 
amino acid sequence length and their Light Chain (L-chain, L) and Heavy Chain (H- 
chain, H) content as these relate to translated DNA sequences in the public domain 
or specifically to SEQ ID NO: 2 and SEQ ID NO: 20. The 'LH' designation is 
followed by '/X' where 'X' denotes the corresponding clostridial toxin serotype or
class, e.g. 'A' for botulinum neurotoxin type A or 'TeTx' for tetanus toxin. 
Sequence variants from that of the native toxin polypeptide are given in parenthesis 
in standard format, namely the residue position number prefixed by the residue of 
the native sequence and suffixed by the residue of the variant. 
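
The variant notation just described (native residue, position number, variant residue) lends itself to a small parser. The sketch below is illustrative only: parse_variant and parse_variant_list are hypothetical helper names, and single-letter amino acid codes are assumed. Whitespace is ignored so that the flattened subscripts of this text, such as "Q 2 E", are accepted.

```python
import re

# Pattern for one variant: native residue, position number, variant residue.
_VARIANT = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(token):
    """Parse one variant such as 'Q2E' (or 'Q 2 E') into (native, position, variant)."""
    compact = re.sub(r"\s+", "", token)
    match = _VARIANT.match(compact)
    if match is None:
        raise ValueError(f"not a recognisable variant: {token!r}")
    native, position, variant = match.groups()
    return native, int(position), variant


def parse_variant_list(text):
    """Parse a parenthesised, comma-separated list such as '(Q2E,N26K,A27Y)'."""
    inner = text.strip().strip("()")
    return [parse_variant(tok) for tok in inner.split(",") if tok.strip()]


if __name__ == "__main__":
    print(parse_variant_list("(Q 2 E,N 26 K, A 27 Y)"))
```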

Subscript number prefixes indicate an amino-terminal (N-terminal) extension, or 
where negative a deletion, to the translated sequence. Similarly, subscript number 
suffixes indicate a carboxy terminal (C-terminal) extension or where negative 
numbers are used, a deletion. Specific sequence inserts such as protease cleavage 
sites are indicated using abbreviations, e.g. Factor Xa is abbreviated to FXa. L- 
chain C-terminal suffixes and H-chain N-terminal prefixes are separated by a '/' to 
indicate the predicted junction between the L and H-chains. Abbreviations for 
engineered ligand sequences are prefixed or suffixed to the clostridial L-chain or H- 
chain corresponding to their position in the translation product. 
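
The naming rules above can be sketched as a small formatter. This is an illustration under stated assumptions, not a definitive implementation: the class Construct and its field names are hypothetical, subscripts are written inline because plain text cannot show them, and the choice to print the L/H junction '/' only when an insert or H-chain extension is present is an assumption inferred from the construct names used in this description.

```python
from dataclasses import dataclass


@dataclass
class Construct:
    """Hypothetical description of a recombinant LH construct."""
    h_length: int          # number of H-chain residues, e.g. 423
    serotype: str          # e.g. "A", "B" or "TeTx"
    l_n_ext: int = 0       # N-terminal extension to the L-chain (residues)
    l_c_insert: str = ""   # insert at the L-chain C-terminus, e.g. "FXa"
    h_n_ext: int = 0       # N-terminal extension to the H-chain (residues)
    c_ligand: str = ""     # C-terminal ligand abbreviation, e.g. "IGF-1"

    def name(self):
        l_part = f"{self.l_n_ext or ''}L{self.l_c_insert}"
        h_part = f"{self.h_n_ext or ''}H{self.h_length}"
        # Assumption: the junction '/' is shown only when something sits at it.
        junction = "/" if (self.l_c_insert or self.h_n_ext) else ""
        base = f"{l_part}{junction}{h_part}/{self.serotype}"
        return f"{base}-{self.c_ligand}" if self.c_ligand else base


if __name__ == "__main__":
    print(Construct(423, "A").name())
    print(Construct(423, "A", l_n_ext=2, l_c_insert="FXa",
                    h_n_ext=2, c_ligand="IGF-1").name())
```

Deletions (negative numbers in the nomenclature) are not handled specially in this sketch.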



Following this nomenclature,

LH 423 /A = SEQ ID NO: 2, containing the entire L-chain and 423
amino acids of the H-chain of botulinum neurotoxin type A;

2 LH 423 /A = a variant of this molecule, containing a two amino acid
extension to the N-terminus of the L-chain;

2 L/ 2 H 423 /A = a further variant in which the molecule contains a two
amino acid extension on the N-terminus of both the L-chain and the H-chain;

2 L FXa / 2 H 423 /A = a further variant containing a two amino acid extension
to the N-terminus of the L-chain, and a Factor Xa cleavage sequence at the
C-terminus of the L-chain which, after cleavage of the molecule with Factor Xa,
leaves a two amino acid N-terminal extension to the H-chain component; and

2 L FXa / 2 H 423 /A-IGF-1 = a variant of this molecule which has a further C-terminal
extension to the H-chain, in this example the insulin-like
growth factor 1 (IGF-1) sequence.

There now follows description of specific embodiments of the invention, illustrated
by drawings in which:

Fig. 1 shows a schematic representation of the domain
structure of botulinum neurotoxin type A (BoNT/A);

Fig. 2 shows a schematic [...] an embodiment of the invention
designated LH 423 /A;
Fig. 3 is a graph comparing activity of native toxin, trypsin generated
"native" LH N /A and an embodiment of the invention designated
2 LH 423 /A (Q 2 E,N 26 K,A 27 Y) in an in vitro peptide cleavage assay;

Fig. 4 is a comparison of the first 33 amino acids in published
sequences of native toxin and embodiments of the invention;

Fig. 5 shows the transition region of an embodiment of the
invention designated L/ 4 H 423 /A illustrating insertion of
four amino acids at the N-terminus of the H N sequence;
amino acids coded for by the Eco 47 III restriction
endonuclease cleavage site are marked and the H N
sequence then begins ALN...;

Fig. 6 shows the transition region of an embodiment of the invention
designated L FXa / 3 H 423 /A illustrating insertion of a Factor Xa
cleavage site at the C-terminus of the L-chain, and three
additional amino acids coded for at the N-terminus of the H N
sequence; the N-terminal amino acid of the cleavage-activated
H N will be cysteine;

Fig. 7 shows the C-terminal portion of the amino acid sequence of an
embodiment of the invention designated L FXa / 3 H 423 /A-IGF-1, a
fusion protein; the IGF-1 sequence begins at position G 882;

Fig. 8 shows the C-terminal portion of the amino acid sequence of an
embodiment of the invention designated L FXa / 3 H 423 /A-CtxA14,
a fusion protein; the C-terminal CtxA sequence begins at
position Q 882;

Fig. 9 shows the C-terminal portion of the amino acid sequence of an
embodiment of the invention designated L FXa / 3 H 423 /A-ZZ, a
fusion protein; the C-terminal ZZ sequence begins at position
A 890 immediately after a genenase recognition site (underlined);

Figs. 10 & 11 show schematic representations of manipulations of
polypeptides of the invention; Fig. 10 shows LH 423 /A
with N-terminal addition of an affinity purification
peptide (in this case GST) and C-terminal addition of an
Ig binding domain; protease cleavage sites R1, R2 and
R3 enable selective enzymatic separation of domains;
Fig. 11 shows specific examples of protease cleavage
sites R1, R2 and R3 and a C-terminal fusion peptide
sequence;

Fig. 12 shows the trypsin sensitive activation region of a
polypeptide of the invention;

Fig. 13 shows Western blot analysis of recombinant LH 107 /B
expressed in E. coli; panel A was probed with anti-BoNT/B
antiserum; lane 1, molecular weight standards;
lanes 2 & 3, native BoNT/B; lane 4, immunopurified
LH 107 /B; panel B was probed with anti-T7 peptide tag
antiserum; lane 1, molecular weight standards; lanes 2
& 3, positive control E. coli T7 expression; lane 4,
immunopurified LH 107 /B.

The sequence listing that accompanies this application contains the following 
sequences:- 



SEQ ID NO:   Sequence

1    DNA coding for LH 423 /A
2    LH 423 /A
3    DNA coding for 23 LH 423 /A (Q 2 E,N 26 K,A 27 Y), of which an
     N-terminal portion is shown in Fig. 4
4    23 LH 423 /A (Q 2 E,N 26 K,A 27 Y)
5    DNA coding for 2 LH 423 /A (Q 2 E,N 26 K,A 27 Y), of which an
     N-terminal portion is shown in Fig. 4
6    2 LH 423 /A (Q 2 E,N 26 K,A 27 Y)
7    DNA coding for native BoNT/A according to Binz et al
8    native BoNT/A according to Binz et al
9    DNA coding for L/ 4 H 423 /A
10   L/ 4 H 423 /A
11   DNA coding for L FXa / 3 H 423 /A
12   L FXa / 3 H 423 /A
13   DNA coding for L FXa / 3 H 423 /A-IGF-1
14   L FXa / 3 H 423 /A-IGF-1
15   DNA coding for L FXa / 3 H 423 /A-CtxA14
16   L FXa / 3 H 423 /A-CtxA14
17   DNA coding for L FXa / 3 H 423 /A-ZZ
18   L FXa / 3 H 423 /A-ZZ
19   DNA coding for LH 728 /B
20   LH 728 /B
21   DNA coding for LH 417 /B
22   LH 417 /B
23   DNA coding for LH 107 /B
24   LH 107 /B
25   DNA coding for LH 423 /A (Q 2 E,N 26 K,A 27 Y)
26   LH 423 /A (Q 2 E,N 26 K,A 27 Y)
27   DNA coding for LH 417 /B wherein the first 274 bases are
     modified to have an E. coli codon bias
28   DNA coding for LH 417 /B wherein bases 691-1641 of the
     native BoNT/B sequence have been replaced by a
     degenerate DNA coding for amino acid residues 231-547
     of the native BoNT/B polypeptide

Example 1 

A 2616 base pair, double stranded gene sequence (SEQ ID NO: 1) has been
assembled from a combination of synthetic, chromosomal and
polymerase-chain-reaction generated DNA (Figure 2). The gene codes for a
polypeptide of 871 amino acid residues corresponding to the entire light-chain (LC,
448 amino acids) and 423 residues of the amino terminus of the heavy-chain (HC)
of botulinum neurotoxin type A. This recombinant product is designated the LH423/A
fragment (SEQ ID NO: 2).
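
The lengths quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the 2616 base pair figure includes a single terminal stop codon (SEQ ID NO: 1 ends in TAA), and the function and constant names are hypothetical.

```python
# Figures quoted in Example 1 of the text.
GENE_LENGTH_BP = 2616
LIGHT_CHAIN_RESIDUES = 448
HEAVY_CHAIN_RESIDUES = 423


def coded_residues(gene_length_bp: int, has_stop_codon: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame of this length."""
    if gene_length_bp % 3 != 0:
        raise ValueError("an open reading frame must be a whole number of codons")
    codons = gene_length_bp // 3
    # Assumption: one codon is the stop codon and codes for no residue.
    return codons - 1 if has_stop_codon else codons


if __name__ == "__main__":
    print("residues coded by the gene:", coded_residues(GENE_LENGTH_BP))
    print("LC + heavy chain portion:  ", LIGHT_CHAIN_RESIDUES + HEAVY_CHAIN_RESIDUES)
```

Both figures agree with the 871 residues stated for the LH423/A fragment.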

Construction of the recombinant product 

The first 918 base pairs of the recombinant gene were synthesised by
concatenation of short oligonucleotides to generate a coding sequence with an
E. coli codon bias. Both DNA strands in this region were completely synthesised as
short overlapping oligonucleotides which were phosphorylated, annealed and
ligated to generate the full synthetic region ending with a unique KpnI restriction
site. The remainder of the LH423/A coding sequence was PCR amplified from total
chromosomal DNA from Clostridium botulinum and annealed to the synthetic
portion of the gene.
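
Giving the synthetic region an E. coli codon bias changes the codons used but not the residues they encode. As an illustration, the following sketch translates the first 16 codons of SEQ ID NO: 1 with the standard genetic code. The helper names are hypothetical and the code is not part of the method described here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 16 codons of SEQ ID NO: 1 as given in the sequence listing.
FIRST_CODONS = "ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT"

if __name__ == "__main__":
    print(translate(FIRST_CODONS))
    # Synonymous codons (here CAG and CAA) give the same residue.
    print(translate("CAG") == translate("CAA"))
```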

The internal PCR amplified product sequences were then deleted and replaced with
the native, fully sequenced, regions from clones of C. botulinum chromosomal
origin to generate the final gene construct. The final composition is synthetic DNA
(bases 1-913), polymerase amplified DNA (bases 914-1138 and 1976-2616) and
the remainder is of C. botulinum chromosomal origin (bases 1139-1975). The
assembled gene was then fully sequenced and cloned into a variety of E. coli
plasmid vectors for expression analysis.
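
The base ranges given for the final construct can be checked for gaps and overlaps. The following is a minimal sketch using only the coordinates quoted above; the data structure and function names are hypothetical.

```python
# (origin, first base, last base), 1-based inclusive, as quoted in the text.
SEGMENTS = [
    ("synthetic", 1, 913),
    ("PCR-amplified", 914, 1138),
    ("chromosomal", 1139, 1975),
    ("PCR-amplified", 1976, 2616),
]


def check_contiguous(segments, total_length):
    """True if the segments tile 1..total_length with no gap or overlap."""
    expected_start = 1
    for _, start, end in sorted(segments, key=lambda seg: seg[1]):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


def bases_by_origin(segments):
    """Total number of bases contributed by each kind of DNA."""
    totals = {}
    for origin, start, end in segments:
        totals[origin] = totals.get(origin, 0) + (end - start + 1)
    return totals


if __name__ == "__main__":
    print("contiguous:", check_contiguous(SEGMENTS, 2616))
    print(bases_by_origin(SEGMENTS))
```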

Expression of the recombinant gene and recovery of protein product 

The DNA is expressed in E. coli as a single nucleic acid transcript producing a
soluble single chain polypeptide of 99,951 Daltons predicted molecular weight. The
gene is currently expressed in E. coli as a fusion to the commercially available
coding sequence of glutathione S-transferase (GST) of Schistosoma japonicum, but
any of an extensive range of recombinant gene expression vectors such as pEZZ18,
pTrc99, pFLAG or the pMAL series may be equally effective, as might expression
in other prokaryotic or eukaryotic hosts such as the Gram positive bacilli, the yeast
P. pastoris or in insect or mammalian cells under appropriate conditions.

Currently, E. coli harbouring the expression construct is grown in Luria-Bertani
broth (L-broth pH 7.0, containing 10 g/l bacto-tryptone, 5 g/l bacto-yeast extract
and 10 g/l sodium chloride) at 37°C until the cell density (biomass) has an optical
absorbance of 0.4-0.6 at 600 nm and the cells are in mid-logarithmic growth
phase. Expression of the gene is then induced by addition of
isopropylthio-β-D-galactoside (IPTG) to a final concentration of 0.5 mM.
Recombinant gene expression is allowed to proceed for 90 min at a reduced
temperature of 25°C. The cells are then harvested by centrifugation,
resuspended in a buffer solution containing 10 mM Na2HPO4, 0.5 M NaCl, 10 mM
EGTA, 0.25% Tween, pH 7.0 and then frozen at -20°C. For extraction of the
recombinant protein the cells are disrupted by sonication. The cell extract is then
cleared of debris by centrifugation and the cleared supernatant fluid containing
soluble recombinant fusion protein (GST-LH423/A) is stored at -20°C pending
purification. A proportion of recombinant material is not released by the sonication
procedure and this probably reflects insolubility or inclusion body formation.
Currently we do not extract this material for analysis but if desired this could be
readily achieved using methods known to those skilled in the art.

The recombinant GST-LH423/A is purified by adsorption onto a commercially
prepared affinity matrix of glutathione Sepharose and subsequent elution with
reduced glutathione. The GST affinity purification marker is then removed by
proteolytic cleavage and re-adsorption to glutathione Sepharose; recombinant
LH423/A is recovered in the non-adsorbed material.
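
The induction step in the protocol above fixes only the final IPTG concentration (0.5 mM). As a rough sketch, the volume of stock to add follows from C1 x V1 = C2 x V2. The 1 M stock and 1 litre culture volume used below are assumptions for illustration, not values from the text, and the small volume change on addition is ignored.

```python
def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock (ml) giving final_mM in culture_ml, ignoring the added volume."""
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / stock_mM


FINAL_IPTG_MM = 0.5          # from the text
ASSUMED_STOCK_MM = 1000.0    # hypothetical 1 M stock solution
ASSUMED_CULTURE_ML = 1000.0  # hypothetical 1 litre culture

if __name__ == "__main__":
    volume = stock_volume_ml(FINAL_IPTG_MM, ASSUMED_CULTURE_ML, ASSUMED_STOCK_MM)
    print(f"add {volume:.2f} ml of stock")
```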

Construct variants 

A variant of the molecule, LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 26), has been
produced in which three amino acid residues have been modified within the light
chain of LH423/A, producing a polypeptide containing a light chain sequence different
to that of the published amino acid sequence of the light chain of BoNT/A.

Two further variants of the gene sequence that have been expressed and the
corresponding products purified are 23LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 4),
which has a 23 amino acid N-terminal extension as compared to the predicted
native L-chain of BoNT/A, and 2LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 6), which has
a 2 amino acid N-terminal extension (Figure 4).

In a further variant of the gene (SEQ ID NO: 9), an Eco47 III restriction site
has been inserted between nucleotides 1344 and 1345 of the gene sequence given in
SEQ ID NO: 1. This modification provides a restriction site at the position in the
gene representing the interface of the heavy and light chains in native neurotoxin,
and provides the capability to make insertions at this point using standard
restriction enzyme methodologies known to those skilled in the art. It will also be
obvious to those skilled in the art that any one of a number of restriction sites
could be so employed, and that the Eco47 III insertion simply exemplifies this
approach. Similarly, it would be obvious to one skilled in the art that insertion of
a restriction site in the manner described could be performed on any gene of the
invention. The gene described, when expressed, codes for a polypeptide, L/4H423/A
(SEQ ID NO: 10), which contains an additional four amino acids between amino
acids 448 and 449 of LH423/A at a position equivalent to the amino terminus of the
heavy chain of native BoNT/A.

A variant of the gene has been expressed, LFXa/3H423/A (SEQ ID NO: 12), in which
a specific proteolytic cleavage site was incorporated at the carboxy-terminal end
of the light chain domain, specifically after residue 448 of L/4H423/A. The cleavage
site incorporated was for Factor Xa protease and was coded for by modification of
SEQ ID NO: 1. It will be apparent to one skilled in the art that a cleavage site for
another specified protease could be similarly incorporated, and that any gene
sequence coding for the required cleavage site could be employed. Modification
of the gene sequence in this manner to code for a defined protease site could be
performed on any gene of the invention.
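
To illustrate how a defined protease site separates two domains, the sketch below splits a polypeptide after each occurrence of a recognition motif. It assumes the commonly cited Factor Xa motif Ile-(Glu or Asp)-Gly-Arg, with cleavage after the arginine; the sequence actually inserted in LFXa/3H423/A is not given in this passage, and the toy peptide and function names are hypothetical. Disulphide bonds, which would hold the chains together after cleavage, are ignored.

```python
import re

# Assumed recognition motif (one-letter codes); cleavage occurs after the final R.
FACTOR_XA_SITE = re.compile(r"I[ED]GR")


def cleave_after_sites(sequence: str, site: "re.Pattern[str]" = FACTOR_XA_SITE) -> list:
    """Split a one-letter amino acid sequence after each match of the site motif."""
    fragments = []
    start = 0
    for match in site.finditer(sequence):
        fragments.append(sequence[start:match.end()])
        start = match.end()
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    toy_peptide = "MKLCIEGRALND"  # hypothetical sequence, not taken from the listing
    print(cleave_after_sites(toy_peptide))
```

Swapping in a different compiled pattern models a site for another protease, in line with the remark above that any defined cleavage site could be used.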

Variants of LFXa/3H423/A have been constructed in which a third domain is present
at the carboxy-terminal end of the polypeptide which incorporates a specific
binding activity into the polypeptide.

Specific examples described are:

(1) LFXa/3H423/A-IGF-1 (SEQ ID NO: 14), in which the carboxy-terminal domain has
a sequence equivalent to that of insulin-like growth factor-1 (IGF-1) and is able to
bind to the insulin-like growth factor receptor with high affinity;

(2) LFXa/3H423/A-CtxA14 (SEQ ID NO: 16), in which the carboxy-terminal domain
has a sequence equivalent to that of the 14 amino acids from the carboxy-terminus
of the A-subunit of cholera toxin (CtxA) and is thereby able to interact with the
cholera toxin B-subunit pentamer; and

(3) LFXa/3H423/A-ZZ (SEQ ID NO: 18), in which the carboxy-terminal domain is a
tandem repeating synthetic IgG binding domain. This variant also exemplifies
another modification applicable to the current invention, namely the inclusion in the
gene of a sequence coding for a protease cleavage site located between the end
of the clostridial heavy chain sequence and the sequence coding for the binding
ligand. Specifically in this example a sequence is inserted at nucleotides 2650 to
2666 coding for a genenase cleavage site. Expression of this gene produces a
polypeptide which has the desired protease sensitivity at the interface between the
domain providing HN function and the binding domain. Such a modification enables
selective removal of the C-terminal binding domain by treatment of the polypeptide
with the relevant protease.

It will be apparent that any one of a number of such binding domains could be
incorporated into the polypeptide sequences of this invention and that the above
examples are merely to exemplify the concept. Similarly, such binding domains can
be incorporated into any of the polypeptide sequences that are the basis of this
invention. Further, it should be noted that such binding domains could be
incorporated at any appropriate location within the polypeptide molecules of the
invention.



Further embodiments of the invention are thus illustrated by a DNA of the invention
further comprising a desired restriction endonuclease site at a desired location and
by a polypeptide of the invention further comprising a desired protease cleavage
site at a desired location.

The restriction endonuclease site may be introduced so as to facilitate further
manipulation of the DNA in manufacture of an expression vector for expressing a
polypeptide of the invention; it may be introduced as a consequence of a previous
step in manufacture of the DNA; it may be introduced by way of modification by
insertion, substitution or deletion of a known sequence. The consequence of
modification of the DNA may be that the amino acid sequence is unchanged, or
may be that the amino acid sequence is changed, for example resulting in
introduction of a desired protease cleavage site. Either way, the polypeptide retains
its first and second domains having the properties required by the invention.



Figure 10 is a diagrammatic representation of an expression product exemplifying
features described in this example. Specifically, it illustrates a single polypeptide
incorporating a domain equivalent to the light chain of botulinum neurotoxin type
A and a domain equivalent to the HN domain of the heavy chain of botulinum
neurotoxin type A with an N-terminal extension providing an affinity purification
domain, namely GST, and a C-terminal extension providing a ligand binding domain,
namely an IgG binding domain. The domains of the polypeptide are spatially
separated by specific protease cleavage sites enabling selective enzymatic
separation of domains as exemplified in the Figure. This concept is more
specifically depicted in Figure 11 where the various protease sensitivities are
defined for the purpose of example.

Assay of product activity 

The LC of botulinum neurotoxin type A exerts a zinc-dependent endopeptidase
activity on the synaptic vesicle associated protein SNAP-25 which it cleaves in a
specific manner at a single peptide bond. The 2LH423/A (Q2E,N26K,A27Y) (SEQ ID
NO: 6) cleaves a synthetic SNAP-25 substrate in vitro under the same conditions
as the native toxin (Figure 3). Thus, the modification of the polypeptide sequence
of 2LH423/A (Q2E,N26K,A27Y) relative to the native sequence and within the minimal
functional LC domains does not prevent the functional activity of the LC domains.

This activity is dependent on proteolytic modification of the recombinant
GST-2LH423/A (Q2E,N26K,A27Y) to convert the single chain polypeptide product to
a disulphide linked dichain species. This is currently done using the proteolytic
enzyme trypsin. The recombinant product (100-600 µg/ml) is incubated at 37°C
for 10-50 minutes with trypsin (10 µg/ml) in a solution containing 140 mM NaCl,
2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.3. The reaction is
terminated by addition of a 100-fold molar excess of trypsin inhibitor. The
activation by trypsin generates a disulphide linked dichain species as determined
by polyacrylamide gel electrophoresis and immunoblotting analysis using polyclonal
anti-botulinum neurotoxin type A antiserum.

LH423/A is more stable in the presence of trypsin and more active in the in vitro
peptide cleavage assay than is 23LH423/A. Both variants, however, are fully
functional in the in vitro peptide cleavage assay. This demonstrates that the
recombinant molecule will tolerate N-terminal amino acid extensions and this may
be expanded to other chemical or organic moieties as would be obvious to those
skilled in the art.



Example 2 



As a further exemplification of this invention a number of gene sequences have
been assembled coding for polypeptides corresponding to the entire light-chain and
varying numbers of residues from the amino terminal end of the heavy chain of
botulinum neurotoxin type B. In this exemplification of the disclosure the gene
sequences assembled were obtained from a combination of chromosomal and
polymerase-chain-reaction generated DNA, and therefore have the nucleotide
sequence of the equivalent regions of the natural genes, thus exemplifying the
principle that the substance of this disclosure can be based upon natural as well
as synthetic gene sequences.



The gene sequences relating to this example were all assembled and expressed
using methodologies as detailed in Sambrook J, Fritsch E F & Maniatis T (1989)
Molecular Cloning: A Laboratory Manual (2nd Edition), Ford N, Nolan C, Ferguson
M & Ockler M (eds), Cold Spring Harbor Laboratory Press, New York, and known
to those skilled in the art.



A gene has been assembled coding for a polypeptide of 1171 amino acids
corresponding to the entire light-chain (443 amino acids) and 728 residues from the
amino terminus of the heavy chain of neurotoxin type B. Expression of this gene
produces a polypeptide, LH728/B (SEQ ID NO: 20), which lacks the specific neuronal
binding activity of full length BoNT/B.

A gene has also been assembled coding for a variant polypeptide, LH417/B (SEQ ID
NO: 22), which possesses an amino acid sequence at its carboxy terminus
equivalent by amino acid homology to that at the carboxy-terminus of the heavy
chain fragment in native LHN/A.

A gene has also been assembled coding for a variant polypeptide, LH107/B (SEQ ID
NO: 24), which expresses at its carboxy-terminus a short sequence from the
amino terminus of the heavy chain of BoNT/B sufficient to maintain solubility of the
expressed polypeptide.

Construct Variants 

A variant of the coding sequence for the first 274 bases of the gene shown in SEQ 
ID NO: 21 has been produced which whilst being a non-native nucleotide sequence 
still codes for the native polypeptide. 

Two double stranded gene sequences, one of 268 base pairs and one of 951 base
pairs, have been created using an overlapping primer PCR strategy. These
sequences were designed to have an E. coli codon usage bias.

For the first sequence, six oligonucleotides representing the first (5') 268
nucleotides of the native sequence for botulinum toxin type B were synthesised.
For the second sequence, 23 oligonucleotides representing internal sequence
nucleotides 691-1641 of the native sequence for botulinum toxin type B were
synthesised. The oligonucleotides ranged from 57-73 nucleotides in length.
Overlapping regions, 17-20 nucleotides, were designed to give melting
temperatures in the range 52-56°C. In addition, terminal restriction endonuclease
sites of the synthetic products were constructed to facilitate insertion of these
products into the exact corresponding region of the native sequence. The 268 bp
5' synthetic sequence has been incorporated into the gene shown in SEQ ID NO:
21 in place of the original first 268 bases (and is shown in SEQ ID NO: 27).
Similarly the sequence could be inserted into other genes of the examples.
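
The oligonucleotide counts, overlap lengths and product sizes quoted above are mutually consistent, as the following sketch suggests. It assumes the oligonucleotides are laid end to end with equal overlaps between neighbours, so that n oligonucleotides share n - 1 overlaps; this layout and all names are assumptions made for illustration, since the actual oligonucleotide designs are not listed here.

```python
def mean_oligo_length(total_nt: int, n_oligos: int, overlap_nt: int) -> float:
    """Mean oligo length needed for n_oligos to span total_nt with equal overlaps."""
    if n_oligos < 1:
        raise ValueError("need at least one oligonucleotide")
    # Each of the (n - 1) junctions is covered twice, once by each neighbour.
    return (total_nt + (n_oligos - 1) * overlap_nt) / n_oligos


# (product length in nucleotides, number of oligonucleotides), from the text.
DESIGNS = {"5' sequence": (268, 6), "internal sequence": (951, 23)}
OVERLAPS_NT = (17, 20)       # reported overlap range
REPORTED_RANGE_NT = (57, 73)  # reported oligonucleotide length range

if __name__ == "__main__":
    for name, (total, count) in DESIGNS.items():
        for overlap in OVERLAPS_NT:
            length = mean_oligo_length(total, count, overlap)
            in_range = REPORTED_RANGE_NT[0] <= length <= REPORTED_RANGE_NT[1]
            print(f"{name}: overlap {overlap} nt, mean length {length:.1f} nt, "
                  f"within reported range: {in_range}")
```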



Another variant sequence equivalent to nucleotides 691 to 1641 of SEQ ID NO: 21,
and employing non-native codon usage whilst coding for a native polypeptide
sequence, has been constructed using the internal synthetic sequence. This
sequence (SEQ ID NO: 28) can be incorporated, alone or in combination with other
variant sequences, in place of the equivalent coding sequence in any of the genes
of the example.

Example 3 

An exemplification of the utility of this invention is as a non-toxic and effective
immunogen. The non-toxic nature of the recombinant, single chain material was
demonstrated by intraperitoneal administration in mice of GST-2LH423/A. The
polypeptide was prepared and purified as described above. The amount of
immunoreactive material in the final preparation was determined by enzyme linked
immunosorbent assay (ELISA) using a monoclonal antibody (BA11) reactive against
a conformation dependent epitope on the native LHN/A. The recombinant material
was serially diluted in phosphate buffered saline (PBS; NaCl 8 g/l, KCl 0.2 g/l,
Na2HPO4 1.15 g/l, KH2PO4 0.2 g/l, pH 7.4) and 0.5 ml volumes injected into 3
groups of 4 mice such that each group of mice received 10, 5 and 1 micrograms
of material respectively. Mice were observed for 4 days and no deaths were seen.
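
For a fixed 0.5 ml injection volume, the dose per mouse determines the concentration each dilution must reach. A minimal sketch of that arithmetic follows; the names are hypothetical.

```python
INJECTION_VOLUME_ML = 0.5  # from the text
DOSES_UG = (10, 5, 1)      # micrograms per mouse for the three groups


def required_concentration(dose_ug: float, volume_ml: float) -> float:
    """Concentration (micrograms per ml) that delivers dose_ug in volume_ml."""
    if volume_ml <= 0:
        raise ValueError("injection volume must be positive")
    return dose_ug / volume_ml


if __name__ == "__main__":
    for dose in DOSES_UG:
        conc = required_concentration(dose, INJECTION_VOLUME_ML)
        print(f"{dose} ug per mouse needs {conc:g} ug/ml")
```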

For immunisation, 20 µg of GST-2LH423/A in a 1.0 ml volume of water-in-oil
emulsion (1:1 vol:vol) using Freund's complete (primary injections only) or Freund's
incomplete adjuvant was administered into guinea pigs via two sub-cutaneous
dorsal injections. Three injections at 10 day intervals were given (day 1, day 10
and day 20) and antiserum collected on day 30. The antisera were shown by ELISA
to be immunoreactive against native botulinum neurotoxin type A and its
derivative LHN/A. Antisera which were botulinum neurotoxin reactive at a dilution
of 1:2000 were used for evaluation of neutralising efficacy in mice. For
neutralisation assays 0.1 ml of antiserum was diluted into 2.5 ml of gelatine
phosphate buffer (GPB; Na2HPO4 anhydrous 10 g/l, gelatin (Difco) 2 g/l, pH 6.5-6.6)
containing a dilution range of botulinum neurotoxin from 0.5 µg (5 × 10^-7 g) to
5 picograms (5 × 10^-12 g). Aliquots of 0.5 ml were injected into mice
intraperitoneally and deaths recorded over a 4 day period. The results are shown
in Table 1 and Table 2. It can clearly be seen that 0.5 ml of 1:40 diluted
anti-GST-2LH423/A antiserum can protect mice against intraperitoneal challenge
with botulinum neurotoxin in the range 5 pg - 50 ng (1 - 10,000 mouse LD50;
1 mouse LD50 = 5 pg).

TABLE 1. Neutralisation of botulinum neurotoxin in mice by guinea pig
anti-GST-2LH423/A antiserum.

          Botulinum Toxin/mouse
Survivors 0.5 µg     0.005 µg   0.0005 µg  0.5 ng     0.005 ng   5 pg       Control
on Day                                                                      (no toxin)

1         0          4          4          4          4          4          4
2         -          4          4          4          4          4          4
3         -          4          4          4          4          4          4
4         -          4          4          4          4          4          4



TABLE 2. Neutralisation of botulinum neurotoxin in mice by non-immune guinea
pig antiserum.

          Botulinum Toxin/mouse
Survivors 0.5 µg     0.005 µg   0.0005 µg  0.5 ng     0.005 ng   5 pg       Control
on Day                                                                      (no toxin)

1         0          0          0          0          0          2          4
2         -          -          -          -          -          0          4
3         -          -          -          -          -          -          4
4         -          -          -          -          -          -          4
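
The challenge doses in Example 3 are related to mouse LD50 units by the conversion stated in the text (1 mouse LD50 = 5 pg). The short sketch below reproduces that conversion for the stated protective range of 5 pg to 50 ng; the function and constant names are hypothetical.

```python
PG_PER_LD50 = 5   # from the text: 1 mouse LD50 = 5 pg
PG_PER_NG = 1000


def ld50_units(mass_pg: float) -> float:
    """Express a toxin mass in picograms as a number of mouse LD50 units."""
    if mass_pg < 0:
        raise ValueError("mass cannot be negative")
    return mass_pg / PG_PER_LD50


if __name__ == "__main__":
    low_pg = 5
    high_pg = 50 * PG_PER_NG
    print(f"{low_pg} pg = {ld50_units(low_pg):g} LD50")
    print(f"{high_pg} pg = {ld50_units(high_pg):g} LD50")
```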



Example 4 

Expression of recombinant LH107/B in E. coli.

As an exemplification of the expression of a nucleic acid coding for an LHN of a
clostridial neurotoxin of a serotype other than botulinum neurotoxin type A, the
nucleic acid sequence (SEQ ID NO: 23) coding for the polypeptide LH107/B (SEQ ID
NO: 24) was inserted into the commercially available plasmid pET28a (Novagen,
Madison, WI, USA). The nucleic acid was expressed in E. coli BL21 (DE3) (New
England BioLabs, Beverly, MA, USA) as a fusion protein with an N-terminal T7
fusion peptide, under IPTG induction at 1 mM for 90 minutes at 37°C. Cultures
were harvested and recombinant protein extracted as described previously for
LH423/A.

Recombinant protein was recovered and purified from bacterial paste lysates by
immunoaffinity adsorption to an immobilised anti-T7 peptide monoclonal antibody
using a T7 tag purification kit (New England BioLabs, Beverly, MA, USA). Purified
recombinant protein was analysed by gradient (4-20%) denaturing
SDS-polyacrylamide gel electrophoresis (Novex, San Diego, CA, USA) and western
blotting using polyclonal anti-botulinum neurotoxin type B antiserum or anti-T7
antiserum. Western blotting reagents were from Novex; immunostained proteins
were visualised using the Enhanced Chemi-Luminescence system (ECL) from
Amersham. The expression of an anti-T7 antibody and anti-botulinum neurotoxin
type B antiserum reactive recombinant product is demonstrated in Figure 13.

The recombinant product was soluble and retained that part of the light chain 
responsible for endopeptidase activity. 



The invention thus provides recombinant polypeptides useful inter alia as
immunogens, enzyme standards and components for synthesis of molecules as
described in WO-A-94/21300.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: MICROBIOLOGICAL RESEARCH AUTHORITY
(B) STREET: Centre For Applied Microbiology And Research
Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: THE SPEYWOOD LABORATORY LIMITED
(B) STREET: 14 Kensington Square
(C) CITY: London
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): W8 5HH

(A) NAME: FOSTER; Keith Alan
(B) STREET: Centre For Applied Microbiology And Research,
Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: QUINN; Conrad Padraig
(B) STREET: Centre For Applied Microbiology And Research,
Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: SHONE; Clifford Charles
(B) STREET: Centre For Applied Microbiology And Research,
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(ii) TITLE OF INVENTION: Recombinant Toxin Fragments

(iii) NUMBER OF SEQUENCES: 28

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2616

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA ... ... AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ... ... TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn ... ... Leu Ala Ala Asn
385 390 395 400

... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn ... ... Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT ... ... CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr ... ... Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA ... ... GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu ... ... Gly Tyr Asn Lys
435 440 445

GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT ... ... GAC TTG TTT TTT 1392
Ala Leu Asn Asp Leu Cys Ile Lys Val Asn ... ... Asp Leu Phe Phe
450 455 460

AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT ... ... AAA GGA GAA GAA 1440
Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp ... ... Lys Gly Glu Glu
465 470 475 480

ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA ... ... AAT ATT AGT TTA 1488
Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala ... ... Asn Ile Ser Leu
485 490 495

GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT ... ... GAT AAT GAA CCT 1536
Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe ... ... Asp Asn Glu Pro
500 505 510

GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 1584
Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 1632
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

... ... AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824
... Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872
Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920
Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 1968
Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG ATT GCA 2016
Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 2064
Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA AAT GAA 2112
Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... 2160
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
705 710 715 720

... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... 2208
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
725 730 735

... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... 2256
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
740 745 750

CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304
Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT TCT ATG 2400
Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 2448
Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA ATT GGT 2496
Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 2544
Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 2592
Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

ACA TTT ACT GAA TAT ATT AAG TAA 2616
Thr Phe Thr Glu Tyr Ile Lys
865 870



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 872 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys *
865 870



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2685 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2685

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



GGA TCC CCA GGA ATT CAT ATG ACG TCG ACG CGT CTG CAG AAG CTT CTA 48
Gly Ser Pro Gly Ile His Met Thr Ser Thr Arg Leu Gln Lys Leu Leu
1 5 10 15

GAA TTC GAG CTC CCG GGT ACC ATG GAG TTC GTG AAC AAG CAG TTC AAC 96
Glu Phe Glu Leu Pro Gly Thr Met Glu Phe Val Asn Lys Gln Phe Asn
20 25 30

TAT AAG GAC CCT GTA AAC GGT GTT GAC ATT GCC TAC ATC AAA ATT CCA 144
Tyr Lys Asp Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro
35 40 45

AAG TAC GGC CAG ATG CAG CCG GTG AAG GCT TTC AAG ATT CAT AAC AAA 192
Lys Tyr Gly Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys
50 55 60

ATC TGG GTT ATT CCG GAA CGC GAT ACA TTT ACG AAC CCG GAA GAA GGA 240
Ile Trp Val Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly
65 70 75 80

GAC TTG AAC CCG CCG CCG GAA GCA AAG CAG GTG CCA GTT TCA TAC TAC 288
Asp Leu Asn Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr
85 90 95

GAT TCA ACC TAT CTG AGC ACA GAC AAC GAG AAG GAT AAC TAC CTG AAG 336
Asp Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys
100 105 110

GGA GTG ACC AAA TTA TTC GAG CGT ATT TAT TCC ACT GAC CTG GGC CGT 384
Gly Val Thr Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg
115 120 125

ATG CTG CTG ACC TCA ATC GTC CGC GGA ATC CCA TTT TGG GGT GGC AGT 432
Met Leu Leu Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser
130 135 140

ACC ATT GAC ACG GAG TTG AAG GTT ATT GAC ACT AAC TGC ATT AAC GTG 480
Thr Ile Asp Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val
145 150 155 160

ATC CAA CCA GAC GGT AGC TAC AGA TCT GAA GAA CTT AAC CTC GTA ATC 528
Ile Gln Pro Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile
165 170 175

ATC GGG CCC TCC GCG GAC ATT ATC CAG TTT GAG TGC AAG AGC TTT GGC 576
Ile Gly Pro Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly
180 185 190

CAC GAA GTG TTG AAC CTG ACG CGT AAC GGT TAC GGC TCT ACT CAG TAC 624
His Glu Val Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr
195 200 205

ATT CGT TTC AGC CCA GAC TTC ACG TTC GGT TTC GAG GAG AGC CTG GAG 672
Ile Arg Phe Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu
210 215 220

GTT GAT ACC AAC CCG CTG TTG GGT GCA GGC AAG TTC GCA ACT GAT CCA 720
Val Asp Thr Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro
225 230 235 240

GCG GTG ACC CTG GCA CAC GAG CTG ATC CAC GCC GGT CAT CGT CTG TAT 768
Ala Val Thr Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr
245 250 255

GGC ATT GCG ATT AAC CCG AAC CGC GTG TTC AAG GTT AAC ACC AAC GCC 816
Gly Ile Ala Ile Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala
260 265 270

TAC TAC GAG ATG AGT GGT TTA GAA GTA AGC TTC GAG GAA CTG CGC ACG 864
Tyr Tyr Glu Met Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr
275 280 285

TTC GGT GGC CAT GAT GCG AAG TTT ATC GAC AGC TTG CAG GAG AAC GAG 912
Phe Gly Gly His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu
290 295 300

TTC CGT CTG TAC TAC TAC AAC AAG TTT AAA GAT ATT GCA AGT ACA CTG 960
Phe Arg Leu Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu
305 310 315 320

AAC AAG GCT AAG TCC ATT GTG GGT ACC ACT GCT TCA TTA CAG TAT ATG 1008
Asn Lys Ala Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met
325 330 335



AAA AAT GTT TTT AAA GAG AAA TAT CTC CTA TCT GAA GAT ACA TCT GGA 1056
Lys Asn Val Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly
340 345 350

AAA TTT TCG GTA GAT AAA TTA AAA TTT GAT AAG TTA TAC AAA ATG TTA 1104
Lys Phe Ser Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu
355 360 365

ACA GAG ATT TAC ACA GAG GAT AAT TTT GTT AAG TTT TTT AAA GTA CTT 1152
Thr Glu Ile Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu
370 375 380

AAC AGA AAA ACA TAT TTG AAT TTT GAT AAA GCC GTA TTT AAG ATA AAT 1200
Asn Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn
385 390 395 400

ATA GTA CCT AAG GTA AAT TAC ACA ATA TAT GAT GGA TTT AAT TTA AGA 1248
Ile Val Pro Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg
405 410 415

AAT ACA AAT TTA GCA GCA AAC TTT AAT GGT CAA AAT ACA GAA ATT AAT 1296
Asn Thr Asn Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn
420 425 430

AAT ATG AAT TTT ACT AAA CTA AAA AAT TTT ACT GGA TTG TTT GAA TTT 1344
Asn Met Asn Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe
435 440 445

TAT AAG TTG CTA TGT GTA AGA GGG ATA ATA ACT TCT AAA ACT AAA TCA 1392
Tyr Lys Leu Leu Cys Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser
450 455 460

TTA GAT AAA GGA TAC AAT AAG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1440
Leu Asp Lys Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val
465 470 475 480

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1488
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
485 490 495

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1536
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
500 505 510

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1584
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
515 520 525

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1632
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
530 535 540

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1680
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
545 550 555 560

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1728
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
565 570 575

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1776
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
580 585 590

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1824
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
595 600 605

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1872
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
610 615 620

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1920
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
625 630 635 640

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1968
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
645 650 655

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 2016
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
660 665 670

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2064
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
675 680 685

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2112
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
690 695 700

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2160
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
705 710 715 720

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2208
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
725 730 735

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2256
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
740 745 750

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2304
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
755 760 765



ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2352
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
770 775 780

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2400
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
785 790 795 800

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2448
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
805 810 815

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2496
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
820 825 830

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2544
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
835 840 845

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2592
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
850 855 860

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2640
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
865 870 875 880

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2685
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
885 890 895



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 895 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Gly Ser Pro Gly Ile His Met Thr Ser Thr Arg Leu Gln Lys Leu Leu
1 5 10 15

Glu Phe Glu Leu Pro Gly Thr Met Glu Phe Val Asn Lys Gln Phe Asn
20 25 30

Tyr Lys Asp Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro
35 40 45

Lys Tyr Gly Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys
50 55 60

Ile Trp Val Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly
65 70 75 80

Asp Leu Asn Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr
85 90 95

Asp Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys
100 105 110

Gly Val Thr Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg
115 120 125

Met Leu Leu Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser
130 135 140

Thr Ile Asp Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val
145 150 155 160

Ile Gln Pro Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile
165 170 175

Ile Gly Pro Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly
180 185 190

His Glu Val Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr
195 200 205

Ile Arg Phe Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu
210 215 220

Val Asp Thr Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro
225 230 235 240

Ala Val Thr Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr
245 250 255

Gly Ile Ala Ile Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala
260 265 270

Tyr Tyr Glu Met Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr
275 280 285

Phe Gly Gly His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu
290 295 300

Phe Arg Leu Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu
305 310 315 320

Asn Lys Ala Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met
325 330 335

Lys Asn Val Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly
340 345 350

Lys Phe Ser Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu
355 360 365

Thr Glu Ile Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu
370 375 380

Asn Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn
385 390 395 400

Ile Val Pro Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg
405 410 415

Asn Thr Asn Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn
420 425 430

Asn Met Asn Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe
435 440 445

Tyr Lys Leu Leu Cys Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser
450 455 460

Leu Asp Lys Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val
465 470 475 480

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
485 490 495

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
500 505 510

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
515 520 525

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
530 535 540

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
545 550 555 560

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
565 570 575

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
580 585 590

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
595 600 605

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
610 615 620

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
625 630 635 640

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
645 650 655

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
660 665 670

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
675 680 685

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
690 695 700

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
705 710 715 720

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
725 730 735

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
740 745 750

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
755 760 765

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
770 775 780

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
785 790 795 800

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
805 810 815

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
820 825 830

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
835 840 845

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
850 855 860

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
865 870 875 880

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
885 890 895

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2622 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2622

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGA TCC ATG GAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA 48
Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val
1 5 10 15

AAC GGT GTT GAC ATT GCC TAC ATC AAA ATT CCA AAG TAC GGC CAG ATG 96
Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Lys Tyr Gly Gln Met
20 25 30

CAG CCG GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG 144
Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro
35 40 45

GAA CGC GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG 192
Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro
50 55 60

CCG GAA GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG 240
Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu
65 70 75 80

AGC ACA GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA 288
Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu
85 90 95

TTC GAG CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA 336
Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser
100 105 110

ATC GTC CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG 384
Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu
115 120 125

TTG AAG GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT 432
Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly
130 135 140

AGC TAC AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG 480
Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala
145 150 155 160

GAC ATT ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC 528
Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn
165 170 175

CTG ACG CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA 576
Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro
180 185 190

GAC TTC ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG 624
Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro
195 200 205

CTG TTG GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA 672
Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala
210 215 220



CAC GAG CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC 720
His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn
225 230 235 240

CCG AAC CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT 768
Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser
245 250 255

GGT TTA GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT 816
Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp
260 265 270

GCG AAG TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC 864
Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr
275 280 285

TAC AAC AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC 912
Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser
290 295 300

ATT GTG GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA 960
Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys
305 310 315 320

GAG AAA TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT 1008
Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp
325 330 335

AAA TTA AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA 1056
Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr
340 345 350

GAG GAT AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT 1104
Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr
355 360 365

TTG AAT TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA 1152
Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val
370 375 380

AAT TAC ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA 1200
Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala
385 390 395 400

GCA AAC TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT 1248
Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr
405 410 415

AAA CTA AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT 1296
Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys
420 425 430

GTA AGA GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC 1344
Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr
435 440 445

AAT AAG GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG 1392
Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu
450 455 460

TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA 1440
Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly
465 470 475 480

GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT 1488
Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile
485 490 495

AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT 1536
Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn
500 505 510

GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC 1584
Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly
515 520 525

CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG 1632
Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys
530 535 540

TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA 1680
Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu
545 550 555 560

TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA 1728
Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu
565 570 575

GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT 1776
Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr
580 585 590

GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG 1824
Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp
595 600 605

GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT 1872
Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser
610 615 620

ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA 1920
Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly
625 630 635 640

CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT 1968
Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly
645 650 655

GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG 2016
Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu
660 665 670

ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG 2064
Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala
675 680 685

AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA 2112
Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg
690 695 700

AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA 2160
Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu
705 710 715 720

GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA 2208
Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu
725 730 735

GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG 2256
Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln
740 745 750

TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT 2304
Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile
755 760 765

GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT 2352
Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile
770 775 780

AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT 2400
Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn
785 790 795 800

TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT 2448
Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser
805 810 815

CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA 2496
Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu
820 825 830

ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT 2544
Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser
835 840 845

ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA 2592
Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu
850 855 860

TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2622
Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val
1 5 10 15

Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Lys Tyr Gly Gln Met
20 25 30

Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro
35 40 45

Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro
50 55 60

Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu
65 70 75 80

Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu
85 90 95

Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser
100 105 110

Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu
115 120 125

Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly
130 135 140

Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala
145 150 155 160

Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn
165 170 175

Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro
180 185 190

Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro
195 200 205

Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala
210 215 220

His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn
225 230 235 240

Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser
245 250 255

Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp
260 265 270

Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr
275 280 285

Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser
290 295 300

Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys
305 310 315 320

Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp
325 330 335

Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr
340 345 350

Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr
355 360 365

Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val
370 375 380

Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala
385 390 395 400

Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr
405 410 415

Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys
420 425 430

Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr
435 440 445

Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu
450 455 460

Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly
465 470 475 480

Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile
485 490 495

Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn
500 505 510

Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly
515 520 525

Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys
530 535 540

Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu
545 550 555 560

Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu
565 570 575

Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr
580 585 590

Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp
595 600 605

Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser
610 615 620

Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly
625 630 635 640

Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly
645 650 655

Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu
660 665 670

Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala
675 680 685

Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg
690 695 700

Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu
705 710 715 720

Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu
725 730 735

Ala Leu Glu Asn Gin Ala Glu Ala Thr Lys Ala He lie Asn Tyr Gin 
7 *0 745 750' 
Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile
755 760 765

Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile
770 775 780

Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn
785 790 795 800

Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser
805 810 815

Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu
820 825 830

Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser
835 840 845

Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu
850 855 860

Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2613 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2613

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATG CCA TTT GTT AAT AAA CAA TTT AAT TAT AAA GAT CCT GTA AAT GGT 48
Met Pro Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAT ATT GCT TAT ATA AAA ATT CCA AAT GCA GGA CAA ATG CAA CCA 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTA AAA GCT TTT AAA ATT CAT AAT AAA ATA TGG GTT ATT CCA GAA AGA 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACA AAT CCT GAA GAA GGA GAT TTA AAT CCA CCA CCA GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAA CAA GTT CCA GTT TCA TAT TAT GAT TCA ACA TAT TTA AGT ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80



288 
AGA ATT TAT TCA ACT GAT CTT GGA AGA ATG TTG TTA ACA TCA ATA GTA 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

AGG GGA ATA CCA TTT TGG GGT GGA AGT ACA ATA GAT ACA GAA TTA AAA 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAT ACT AAT TGT ATT AAT GTG ATA CAA CCA GAT GGT AGT TAT 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCA GAA GAA CTT AAT CTA GTA ATA ATA GGA CCC TCA GCT GAT ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATA CAG TTT GAA TGT AAA AGC TTT GGA CAT GAA GTT TTG AAT CTT ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGA AAT GGT TAT GGC TCT ACT CAA TAC ATT AGA TTT AGC CCA GAT TTT 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACA TTT GGT TTT GAG GAG TCA CTT GAA GTT GAT ACA AAT CCT CTT TTA 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAA TTT GCT ACA GAT CCA GCA GTA ACA TTA GCA CAT GAA 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTT ATA CAT GCT GGA CAT AGA TTA TAT GGA ATA GCA ATT AAT CCA AAT 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

AGG GTT TTT AAA GTA AAT ACT AAT GCC TAT TAT GAA ATG AGT GGG TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255



Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATA GAT AGT TTA CAG GAA AAC GAA TTT CGT CTA TAT TAT TAT AAT 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATA GCA AGT ACA CTT AAT AAA GCT AAA TCA ATA GTA 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACT ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT 1392
Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA GAA GAA 1440
Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT AGT TTA 1488
Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT GAA CCT 1536
Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 1584
Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 1632
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824
Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872
Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920
Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 1968
Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG ATT GCA 2016
Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 2064
Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA AAT GAA 2112
Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA GCA AAG 2160
Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA GCT TTA 2208
Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG TAT AAT 2256
Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304
Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT TCT ATG 2400
Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 2448
Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA ATT GGT 2496
Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 2544
Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 2592
Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

ACA TTT ACT GAA TAT ATT AAG 2613
Thr Phe Thr Glu Tyr Ile Lys
865 870
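
The paired rows of this listing follow the standard genetic code: each three-letter residue sits under the codon that encodes it. As a minimal sketch (not part of the original listing), the Python below rebuilds the standard codon table and translates single rows of SEQ ID NO: 7. The names `translate`, `CODON_TO_ONE` and `ONE_TO_THREE` are illustrative choices, and the sketch assumes space-separated, upper-case codons as printed here.

```python
# Standard genetic code, laid out in the conventional T, C, A, G ordering
# for the first, second and third base of each codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (e.g. "ATG") to a one-letter residue code ("M").
CODON_TO_ONE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes, as used in the listing.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",  # stop codon, printed as an asterisk in the listing
}


def translate(dna_row: str) -> str:
    """Translate a row of space-separated codons into three-letter residues."""
    codons = dna_row.split()
    return " ".join(ONE_TO_THREE[CODON_TO_ONE[codon]] for codon in codons)


# First and last rows of SEQ ID NO: 7 as printed in the listing.
first_row = "ATG CCA TTT GTT AAT AAA CAA TTT AAT TAT AAA GAT CCT GTA AAT GGT"
last_row = "ACA TTT ACT GAA TAT ATT AAG"

print(translate(first_row))
print(translate(last_row))
```

A row whose translation disagrees with the residue line printed beneath it points to a likely transcription error in one of the two lines.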

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 871 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Pro Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys
865 870

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2628 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2628

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

AGC GCT GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG 1392
Ser Ala Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp
450 455 460

GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT 1440
Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn
465 470 475 480

AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA 1488
Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu
485 490 495

AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT 1536
Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe
500 505 510

GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT 1584
Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile
515 520 525

ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA 1632
Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly
530 535 540

AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT 1680
Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala
545 550 555 560

CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT 1728
Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val
565 570 575

AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA 1776
Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser
580 585 590

GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA 1824
Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu
595 600 605

GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA 1872
Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu
610 615 620

GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT 1920
Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr
625 630 635 640

ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT 1968
Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe
645 650 655

GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA 2016
Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile
660 665 670

CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT 2064
Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr
675 680 685

ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT 2112
Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser
690 695 700

AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT 2160
Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn
705 710 715 720

TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG 2208
Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met
725 730 735

AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC 2256
Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn
740 745 750

TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT 2304
Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe
755 760 765

AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT 2352
Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala
770 775 780

ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA 2400
Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu
785 790 795 800

ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT 2448
Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp
805 810 815

GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA 2496
Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly
820 825 830

ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA 2544
Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr
835 840 845

CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA 2592
Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln
850 855 860

AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2628
Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870 875

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ser Ala Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp
450 455 460

Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn
465 470 475 480

Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu
485 490 495

Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe
500 505 510

Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile
515 520 525

Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly
530 535 540

Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala
545 550 555 560

Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val
565 570 575

Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser
580 585 590

Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu
595 600 605

Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu
610 615 620

Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr
625 630 635 640

Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe
645 650 655

Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile
660 665 670

Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr
675 680 685

Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser
690 695 700

Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn
705 710 715 720

Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met
725 730 735

Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn
740 745 750

Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe
755 760 765

Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala
770 775 780

Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu
785 790 795 800

Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp
805 810 815

Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly
820 825 830

Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr
835 840 845

Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln
850 855 860

Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870 875
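
The lengths stated in the headers can be cross-checked by simple arithmetic: a coding sequence of N base pairs holds N/3 codons. The sketch below is an illustration only, built around a hypothetical helper `codon_count`, and applies this to the values quoted for SEQ ID NO: 7 to 10. It assumes, as the listings suggest, that the 876 positions of SEQ ID NO: 10 include the terminal stop marked `*`, whereas the 871 residues of SEQ ID NO: 8 do not.

```python
def codon_count(cds_length_bp: int) -> int:
    """Return the number of codons in a CDS, rejecting lengths not divisible by 3."""
    if cds_length_bp % 3 != 0:
        raise ValueError(f"{cds_length_bp} bp is not a whole number of codons")
    return cds_length_bp // 3


# (CDS length in base pairs, length stated for the paired protein listing)
stated_lengths = {
    "SEQ ID NO: 7 / 8": (2613, 871),   # CDS ends on a Lys codon, no stop listed
    "SEQ ID NO: 9 / 10": (2628, 876),  # CDS ends on TAA, stop counted as '*'
}

for label, (base_pairs, listed_positions) in stated_lengths.items():
    codons = codon_count(base_pairs)
    status = "consistent" if codons == listed_positions else "MISMATCH"
    print(f"{label}: {base_pairs} bp -> {codons} codons, "
          f"listing states {listed_positions} ({status})")
```

The same check applies to the position counter printed at the end of each nucleotide row, which advances by 48 (16 codons of 3 bases) per full row.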

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2637 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2637

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

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AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT ACT CGT GTT TAT ACA TTT 
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr J5e 1 76 

580 585 590 

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala S 
595 600 60S 

TGG ^ t ^ 

Met Phe Leu Gly Trp Val Glu Gin Leu Val Tyr Asp Phe Thr Asp Glu 
610 615 620 

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 
Thr Ser Glu Val Ser Thr Thr Asp Lys lie Ala Asp lie £hr He ill 
625 630 635 640 

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 
lie Pro Tyr lie Gly Pro Ala Leu Asn lie Gly Asn Met Leu Tyr Lys 
645 650 6 5 5 

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 
Asp Asp Phe Val Gly Ala Leu He Phe Ser Gly Ala Val He Leu Leu 
660 665 670 

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 
Glu Phe lie Pro Glu He Ala lie Pro Val Leu Gly ?£ J£ Ma 
675 680 685 

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 
Val Ser Tyr lie Ala Asn Lys Val Leu Thr Val Gin Thr lie Asp £n 
690 695 700 



GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr He 
705 710 715 720 

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gin He Asp Leu He Arg 
725 7-?n 

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 
Lys Lys Met Lys Glu Ala Leu Glu Asn Gin Ala Glu Ala Thr Lvs Ala 
740 745 750 

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 
lie He Asn Tyr Gin Tyr Asn Gin Tyr Thr Glu Glu Glu Lys Asn Asn 
755 760 765 

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 
He Asn Phe Asn He Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser He 
770 775 780 

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 
Asn Lys Ala Met He Asn He. Asn Lys Phe Leu Asn Gin Cys Ser Val 
78S 790 795 800 

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 
Ser Tyr Leu Met Asn Ser Met He Pro Tyr Gly Val Lys Arg Leu Glu 
8°5 810 * 815 

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr He Tvr Asd 
820 825 830 

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 
Asn Arg Gly Thr Leu He Gly Gin Val Asp Arg Leu Lvs Asp Lys Val 
835 840 S4 5 



1824 
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1920 



1968 
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2064 



2112 



2160 



2208 



2256 
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2400 



2448 



~2~4"9~6~ 
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AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592 
Asn Asn Thr Leu Ser Thr Asp lie Pro Phe Gin Leu Ser Lys Tyr Val 
850 855 860 

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 26 3 7 

Asp Asn Gin Arg Leu Leu Ser Thr Phe Thr Glu Tyr lie Lys * 
965 870 875 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 879 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870 875

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2862 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2862

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:



WO 98/07864 



PCT/GB97/02273 



67 



ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815




GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCT AGG 2640
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

CCT GGA CCG GAG ACG CTC TGC GGG GCT GAG CTG GTG GAT GCT CTT CAG 2688
Pro Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln
885 890 895

TTC GTG TGT GGA GAC AGG GGC TTT TAT TTC AAC AAG CCC ACA GGG TAT 2736
Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr
900 905 910

GGC TCC AGC AGT CGG AGG GCG CCT CAG ACA GGT ATC GTG GAT GAG TGC 2784
Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
915 920 925

TGC TTC CGG AGC TGT GAT CTA AGG AGG CTG GAG ATG TAT TGC GCA CCC 2832
Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro
930 935 940

CTC AAG CCT GCC AAG TCA GCT GAA GCT TAG 2862
Leu Lys Pro Ala Lys Ser Ala Glu Ala *
945 950



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 954 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110




Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

Pro Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln
885 890 895

Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr
900 905 910

Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
915 920 925

Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro
930 935 940

Leu Lys Pro Ala Lys Ser Ala Glu Ala *
945 950

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2724 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 4 8 

Met Gin Phe Val Asn Lys Gin Phe Asn Tyr Lys Asp Pro Val Asn Gly 
15 10 15 

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96 
Val Asp He Ala Tyr He Lys He Pro Asn Ala Gly Gin Met Gin Pro 
20 25 30 

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 14 4 

Val Lys Ala Phe Lys He His Asn Lys He Trp Val He Pro Glu Arg 
35 40 45 

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192 
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu 
50 55 60 

G CA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 2 4 0 

Ala Lys Gin Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr 
65 70 75 80 



GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lvs Leu Phe Glu 
85 90 * 95 



288 
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CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336 
Arg lie Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser lie Val 
100 105 110 

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 3 84 

Arg Gly lie Pro Phe Trp Gly Gly Ser Thr lie Asp Thr Glu Leu Lys 
115 120 125 

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432 
Val lie Asp Thr Asn Cys lie Asn Val lie Gin Pro Asp Gly Ser Tyr 
130 135 140 

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 4 80 

Arg Ser Glu Glu Leu Asn Leu Val lie lie Glv Pro Ser Ala Asp lie 
145 150 155 160 

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 52 8 

lie Gin Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 
165 * 170 175 

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576 
Arg Asn Gly Tyr Gly Ser Thr Gin Tyr lie Arg Phe Ser Pro Asn Phe 
180 135 190 

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 6 24 

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 6 72 

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 72 0 

Leu lie His Ala Gly His Arg Leu Tyr Gly He Ala He Asn Pro Asn 
225 230 235 240 

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 76 8 

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816 
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lvs 
260 265 270 

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864 
Phe He Asp Ser Leu Gin Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912 
Lys Phe Lys Asp He Ala Ser Thr Leu Asn Lys Ala Lys Ser He Val 
290 295 300 

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 96 0 

Gly Thr Thr Ala Ser Leu Gin Tyr Met Lys Asn Val Phe Lys Glu Lys 
305 310 315 320 

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 100 8 

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu 
325 330 335 

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056 
Lys Phe Asp Lys Leu Tyr Lys Met "Leu Thr Glu He Tyr Thr Glu Asp 
340 345 350 

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104 
Asn Phe Val Lvs Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn 
355 * 360 365 
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TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC ncn 
Phe Asp Lys Ala Val Phe Lys He Asn He Val Pro Lys Val Asn Tyr 
37t > 375 380 

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC ^200 
Thr lie Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
385 390 395 400 

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248 
Phe Asn Gly Gin Asn Thr Glu He Asn Asn Met Asn Phe Thr Lys Leu 
405 410 415 

AAA AAT >TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296 
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344 
Gly He He Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445 

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 13 92 

He Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys He Lys Val 
450 455 460 

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440 
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn 
465 470 475 480 

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 14 88 

Asp Leu Asn Lys Gly Glu Glu He Thr Ser Asp Thr Asn He Glu Ala 
485 490 495 

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536 
Ala Glu Glu Asn He Ser Leu Asp Leu He Gin Gin Tyr Tyr Leu Thr 
500 505 510 

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA IS 84 

Phe Asn Phe Asp Asn Glu Pro Glu Asn He Ser He Glu Asn Leu Ser 
515 520 525 

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632 
Ser Asp He lie Gly Gin Leu Glu Leu Met Pro Asn He Glu Arg Phe 
530 535 540 

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680 
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 
545 550 555 560 

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 172 8 

Leu Arg Ala Gin Glu Phe Glu His Gly Lys Ser Arg He Ala Leu Thr 
565 570 575 

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776 
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe 
580 585 590 

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 182 4 

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala 
595 600 60S 

ATG TTT TTA GGC T ^_GTA---GAA„CAAJrTA-.GTA TAT GAT-JT-TT— ACC— GAT— GAA 1-8-75 

Met Phe Leu Gly Trp Val Glu Gin Leu Val Tyr Asp Phe Thr Asp Glu 
610 615 620 

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 192 0 

Thr Ser Glu Val Ser Thr Thr Asp Lys He Ala Asp He Thr He lie 
625 630 635 640 
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ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 196 8 

He Pro Tyr He Gly Pro Ala Leu Asn He Gly Asn Met Leu Tyr Lys 

645 650 655 

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCT AGG 2640
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

CCT CAA TCT AAA GTT AAA AGA CAA ATA TTT TCA GGC TAT CAA TCT GAT 2688
Pro Gln Ser Lys Val Lys Arg Gln Ile Phe Ser Gly Tyr Gln Ser Asp
885 890 895

ATT GAT ACA CAT AAT AGA ATT AAG GAT GAA TTA TGA 2724
Ile Asp Thr His Asn Arg Ile Lys Asp Glu Leu *
900 905

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 
545 550 555 560 

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575 

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe 
580 585 590 

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala 
595 600 605 

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

Pro Gln Ser Lys Val Lys Arg Gln Ile Phe Ser Gly Tyr Gln Ser Asp
885 890 895

Ile Asp Thr His Asn Arg Ile Lys Asp Glu Leu *
900 905



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3042 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..3042



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624 
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672 
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720 
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816 
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 
260 265 270 

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016 
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256 
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCA GGC 2640
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Gly
865 870 875 880

CTG AAT TCC CCG GGT GCA GCT CAT TAT GCG CAA CAC GAT GAA GCC GTA 2688
Leu Asn Ser Pro Gly Ala Ala His Tyr Ala Gln His Asp Glu Ala Val
885 890 895

GAC AAC AAA TTC AAC AAA GAA CAA CAA AAC GCG TTC TAT GAG ATC TTA 2736
Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu
900 905 910

CAT TTA CCT AAC TTA AAC GAA GAA CAA CGA AAC GCC TTC ATC CAA AGT 2784
His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser
915 920 925

TTA AAA GAT GAC CCA AGC CAA AGC GCT AAC CTT TTA GCA GAA GCT AAA 2832
Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys
930 935 940

AAG CTA AAT GAT GCT CAG GCG CCG AAA GTA GAC AAC AAA TTC AAC AAA 2880
Lys Leu Asn Asp Ala Gln Ala Pro Lys Val Asp Asn Lys Phe Asn Lys
945 950 955 960

GAA CAA CAA AAC GCG TTC TAT GAG ATC TTA CAT TTA CCT AAC TTA AAC 2928
Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu His Leu Pro Asn Leu Asn
965 970 975

GAA GAA CAA CGA AAC GCC TTC ATC CAA AGT TTA AAA GAT GAC CCA AGC 2976
Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser
980 985 990

CAA AGC GCT AAC CTT TTA GCA GAA GCT AAA AAG CTA AAT GAT GCT CAG 3024
Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys Lys Leu Asn Asp Ala Gln
995 1000 1005

GCG CCG AAA GTA GAC TAG 3042
Ala Pro Lys Val Asp *
1010



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1014 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 
545 550 555 560 

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700



Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Gly
865 870 875 880

Leu Asn Ser Pro Gly Ala Ala His Tyr Ala Gln His Asp Glu Ala Val
885 890 895

Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu
900 905 910

His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser
915 920 925

Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys
930 935 940

Lys Leu Asn Asp Ala Gln Ala Pro Lys Val Asp Asn Lys Phe Asn Lys
945 950 955 960

Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu His Leu Pro Asn Leu Asn
965 970 975

Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser
980 985 990

Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys Lys Leu Asn Asp Ala Gln
995 1000 1005

Ala Pro Lys Val Asp * 
1010 
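
The correspondence between SEQ ID NO: 17 and SEQ ID NO: 18 can be checked mechanically. The following is a minimal sketch, not part of the original listing, that assumes the standard genetic code; the names CODON_TABLE and translate are illustrative only. It translates the first and last codon rows of SEQ ID NO: 17 as given above and confirms that a 3042 base pair coding sequence corresponds to 1014 codon positions, the last of which is the stop codon shown as "*".

```python
# Standard genetic code, built in the conventional T, C, A, G ordering.
# One-letter amino acid codes are used; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace between codons is ignored)."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First and last codon rows of SEQ ID NO: 17, copied from the listing.
first_row = "ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT"
last_row = "GCG CCG AAA GTA GAC TAG"

print(translate(first_row))  # Met Gln Phe Val ... in one-letter form
print(translate(last_row))   # Ala Pro Lys Val Asp followed by the stop
print(3042 // 3)             # number of codon positions in the CDS
```

The same function can be applied row by row to any of the nucleotide sequences in the listing to compare the translation against the three-letter residues printed beneath each row.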

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3509 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..3509

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT GAT CCT ATT GAT AAT 48
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15



AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT GCG AGA GGT ACG GGG AGA 96 
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT TGG ATA ATA CCG GAA 144
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT AAT AAA AGT TCC GGT 192
Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

ATT TTT AAT AGA GAT GTT TGT GAA TAT TAT GAT CCA GAT TAC TTA AAT 240
Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

ACT AAT GAT AAA AAG AAT ATA TTT TTA CAA ACA ATG ATC AAG TTA TTT 288
Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

AAT AGA ATC AAA TCA AAA CCA TTG GGT GAA AAG TTA TTA GAG ATG ATT 336 
Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

ATA AAT GGT ATA CCT TAT CTT GGA GAT AGA CGT GTT CCA CTC GAA GAG 384
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

TTT AAC ACA AAC ATT GCT AGT GTA ACT GTT AAT AAA TTA ATC AGT AAT 432
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

CCA GGA GAA GTG GAG CGA AAA AAA GGT ATT TTC GCA AAT TTA ATA ATA 480
Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

TTT GGA CCT GGG CCA GTT TTA AAT GAA AAT GAG ACT ATA GAT ATA GGT 528
Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

ATA CAA AAT CAT TTT GCA TCA AGG GAA GGC TTC GGG GGT ATA ATG CAA 576 
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT CAA GAA 624
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 672
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT GTT TTA CAT GGA TTA TAT 720
Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

GGC ATT AAA GTA GAT GAT TTA CCA ATT GTA CCA AAT GAA AAA AAA TTT 768
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

TTT ATG CAA TCT ACA GAT GCT ATA CAG GCA GAA GAA CTA TAT ACA TTT 816
Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

GGA GGA CAA GAT CCC AGC ATC ATA ACT CCT TCT ACG GAT AAA AGT ATC 864
Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

TAT GAT AAA GTT TTG CAA AAT TTT AGA GGG ATA GTT GAT AGA CTT AAC 912
Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

AAG GTT TTA GTT TGC ATA TCA GAT CCT AAC ATT AAT ATT AAT ATA TAT 960
Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

AAA AAT AAA TTT AAA GAT AAA TAT AAA TTC GTT GAA GAT TCT GAG GGA 1008 
Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly 
325 330 335 

AAA TAT AGT ATA GAT GTA GAA AGT TTT GAT AAA TTA TAT AAA AGC TTA 1056
Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

ATG TTT GGT TTT ACA GAA ACT AAT ATA GCA GAA AAT TAT AAA ATA AAA 1104
Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

ACT AGA GCT TCT TAT TTT AGT GAT TCC TTA CCA CCA GTA AAA ATA AAA 1152
Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

AAT TTA TTA GAT AAT GAA ATC TAT ACT ATA GAG GAA GGG TTT AAT ATA 1200
Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

TCT GAT AAA GAT ATG GAA AAA GAA TAT AGA GGT CAG AAT AAA GCT ATA 1248
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

AAT AAA CAA GCT TAT GAA GAA ATT AGC AAG GAG CAT TTG GCT GTA TAT 1296
Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

AAG ATA CAA ATG TGT AAA AGT GTT AAA GCT CCA GGA ATA TGT ATT GAT 1344
Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

GTT GAT AAT GAA GAT TTG TTC TTT ATA GCT GAT AAA AAT AGT TTT TCA 1392
Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

GAT GAT TTA TCT AAA AAC GAA AGA ATA GAA TAT AAT ACA CAG AGT AAT 1440
Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

TAT ATA GAA AAT GAC TTC CCT ATA AAT GAA TTA ATT TTA GAT ACT GAT 1488
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

TTA ATA AGT AAA ATA GAA TTA CCA AGT GAA AAT ACA GAA TCA CTT ACT 1536
Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

GAT TTT AAT GTA GAT GTT CCA GTA TAT GAA AAA CAA CCC GCT ATA AAA 1584
Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

AAA ATT TTT ACA GAT GAA AAT ACC ATC TTT CAA TAT TTA TAC TCT CAG 1632
Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

ACA TTT CCT CTA GAT ATA AGA GAT ATA AGT TTA ACA TCT TCA TTT GAT 1680
Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp
545 550 555 560

GAT GCA TTA TTA TTT TCT AAC AAA GTT TAT TCA TTT TTT TCT ATG GAT 1728
Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp
565 570 575

TAT ATT AAA ACT GCT AAT AAA GTG GTA GAA GCA GGA TTA TTT GCA GGT 1776
Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly
580 585 590

TGG GTG AAA CAG ATA GTA AAT GAT TTT GTA ATC GAA GCT AAT AAA AGC 1824
Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser
595 600 605



AAT ACT ATG GAT AAA ATT GCA GAT ATA TCT CTA ATT GTT CCT TAT ATA 1872
Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile
610 615 620

GGA TTA GCT TTA AAT GTA GGA AAT GAA ACA GCT AAA GGA AAT TTT GAA 1920
Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu
625 630 635 640

AAT GCT TTT GAG ATT GCA GGA GCC AGT ATT CTA CTA GAA TTT ATA CCA 1968
Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro
645 650 655

GAA CTT TTA ATA CCT GTA GTT GGA GCC TTT TTA TTA GAA TCA TAT ATT 2016
Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile
660 665 670

GAC AAT AAA AAT AAA ATT ATT AAA ACA ATA GAT AAT GCT TTA ACT AAA 2064
Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys
675 680 685

AGA AAT GAA AAA TGG AGT GAT ATG TAC GGA TTA ATA GTA GCG CAA TGG 2112
Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp
690 695 700



CTC TCA ACA GTT AAT ACT CAA TTT TAT ACA ATA AAA GAG GGA ATG TAT 2160 
Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr
705 710 715 720

AAG GCT TTA AAT TAT CAA GCA CAA GCA TTG GAA GAA ATA ATA AAA TAC 2208
Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
725 730 735

AGA TAT AAT ATA TAT TCT GAA AAA GAA AAG TCA AAT ATT AAC ATC GAT 2256
Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
740 745 750

TTT AAT GAT ATA AAT TCT AAA CTT AAT GAG GGT ATT AAC CAA GCT ATA 2304
Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile
755 760 765

GAT AAT ATA AAT AAT TTT ATA AAT GGA TGT TCT GTA TCA TAT TTA ATG 2352
Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met
770 775 780



-AAA-AAA— ATG-AT-T— GGA— TTA~GCT—GTA-GAA—AA^ 24'00 

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn
785 790 795 800

ACT CTC AAA AAA AAT TTG TTA AAT TAT ATA GAT GAA AAT AAA TTA TAT 2448
Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr
805 810 815

TTG ATT GGA AGT GCA GAA TAT GAA AAA TCA AAA GTA AAT AAA TAC TTG 2496
Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu
820 825 830

AAA ACC ATT ATG CCG TTT GAT CTT TCA ATA TAT ACC AAT GAT ACA ATA 2544
Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile
835 840 845

CTA ATA GAA ATG TTT AAT AAA TAT AAT AGC GAA ATT TTA AAT AAT ATT 2592
Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser Glu Ile Leu Asn Asn Ile
850 855 860

ATC TTA AAT TTA AGA TAT AAG GAT AAT AAT TTA ATA GAT TTA TCA GGA 2640
Ile Leu Asn Leu Arg Tyr Lys Asp Asn Asn Leu Ile Asp Leu Ser Gly
865 870 875 880

TAT GGG GCA AAG GTA GAG GTA TAT GAT GGA GTC GAG CTT AAT GAT AAA 2688
Tyr Gly Ala Lys Val Glu Val Tyr Asp Gly Val Glu Leu Asn Asp Lys
885 890 895

AAT CAA TTT AAA TTA ACT AGT TCA GCA AAT AGT AAG ATT AGA GTG ACT 2736
Asn Gln Phe Lys Leu Thr Ser Ser Ala Asn Ser Lys Ile Arg Val Thr
900 905 910

CAA AAT CAG AAT ATC ATA TTT AAT AGT GTG TTC CTT GAT TTT AGC GTT 2784
Gln Asn Gln Asn Ile Ile Phe Asn Ser Val Phe Leu Asp Phe Ser Val
915 920 925

AGC TTT TGG ATA AGA ATA CCT AAA TAT AAG AAT GAT GGT ATA CAA AAT 2832
Ser Phe Trp Ile Arg Ile Pro Lys Tyr Lys Asn Asp Gly Ile Gln Asn
930 935 940

TAT ATT CAT AAT GAA TAT ACA ATA ATT AAT TGT ATG AAA AAT AAT TCG 2880
Tyr Ile His Asn Glu Tyr Thr Ile Ile Asn Cys Met Lys Asn Asn Ser
945 950 955 960

GGC TGG AAA ATA TCT ATT AGG GGT AAT AGG ATA ATA TGG ACT TTA ATT 2928
Gly Trp Lys Ile Ser Ile Arg Gly Asn Arg Ile Ile Trp Thr Leu Ile
965 970 975

GAT ATA AAT GGA AAA ACC AAA TCG GTA TTT TTT GAA TAT AAC ATA AGA 2976
Asp Ile Asn Gly Lys Thr Lys Ser Val Phe Phe Glu Tyr Asn Ile Arg
980 985 990

GAA GAT ATA TCA GAG TAT ATA AAT AGA TGG TTT TTT GTA ACT ATT ACT 3024
Glu Asp Ile Ser Glu Tyr Ile Asn Arg Trp Phe Phe Val Thr Ile Thr
995 1000 1005

AAT AAT TTG AAT AAC GCT AAA ATT TAT ATT AAT GGT AAG CTA GAA TCA 3072
Asn Asn Leu Asn Asn Ala Lys Ile Tyr Ile Asn Gly Lys Leu Glu Ser
1010 1015 1020

AAT ACA GAT ATT AAA GAT ATA AGA GAA GTT ATT GCT AAT GGT GAA ATA 312 0 

Asn Thr Asp He Lys Asp He Arg Glu Val He Ala Asn Gly Glu He 
1025 1030 1035 1040 

ATA TTT AAA TTA GAT GGT GAT ATA GAT AGA ACA CAA TTT ATT TGG ATG 3168 
He Phe Lys Leu Asp Gly Asp He Asp Arg Thr Gin Phe He Trp Met 
1045 1050 1055 

AAA TAT TTC AGT ATT TTT AAT ACG GAA TTA AGT CAA TCA AAT ATT GAA 3 216 

Lys Tyr Phe Ser He Phe Asn Thr Glu Leu Ser Gin Ser Asn He Glu 
1060 1065 1070 

GAA AGA TAT AAA ATT CAA TCA TAT AGC GAA TAT TTA AAA GAT TTT TGG 3 26 4 

Glu Arg Tyr Lys He Gin Ser Tyr Ser Glu Tyr Leu Lys Asp Phe Trp 
1075 1080 1085 
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GGA AAT CCT TTA ATG TAC AAT AAA GAA TAT TAT ATG TTT AAT GCG GGG 3312 
Gly Asn Pro Leu Met Tyr Asn Lys Glu Tyr Tyr Met Phe Asn Ala Gly 
1090 1095 1100 

AAT AAA AAT TCA TAT ATT AAA CTA AAG AAA GAT TCA CCT GTA GGT GAA 3360
Asn Lys Asn Ser Tyr Ile Lys Leu Lys Lys Asp Ser Pro Val Gly Glu
1105 1110 1115 1120

ATT TTA ACA CGT AGC AAA TAT AAT CAA AAT TCT AAA TAT ATA AAT TAT 3408
Ile Leu Thr Arg Ser Lys Tyr Asn Gln Asn Ser Lys Tyr Ile Asn Tyr
1125 1130 1135

AGA GAT TTA TAT ATT GGA GAA AAA TTT ATT ATA AGA AGA AAG TCA AAT 3456
Arg Asp Leu Tyr Ile Gly Glu Lys Phe Ile Ile Arg Arg Lys Ser Asn
1140 1145 1150

TCT CAA TCT ATA AAT GAT GAT ATA GTT AGA AAA GAA GAT TAT ATA TAT 3504
Ser Gln Ser Ile Asn Asp Asp Ile Val Arg Lys Glu Asp Tyr Ile Tyr
1155 1160 1165

CTA GA 3509
Leu



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1169 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp
545 550 555 560

Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp
565 570 575

Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly
580 585 590

Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser
595 600 605

Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile
610 615 620

Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu
625 630 635 640

Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro
645 650 655

Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile
660 665 670

Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys
675 680 685

Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp
690 695 700

Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr
705 710 715 720

Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
725 730 735

Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
740 745 750

Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile
755 760 765

Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met
770 775 780

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn
785 790 795 800

Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr
805 810 815

Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu
820 825 830

Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile
835 840 845

Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser Glu Ile Leu Asn Asn Ile
850 855 860

Ile Leu Asn Leu Arg Tyr Lys Asp Asn Asn Leu Ile Asp Leu Ser Gly
865 870 875 880

Tyr Gly Ala Lys Val Glu Val Tyr Asp Gly Val Glu Leu Asn Asp Lys
885 890 895

Asn Gln Phe Lys Leu Thr Ser Ser Ala Asn Ser Lys Ile Arg Val Thr
900 905 910

Gln Asn Gln Asn Ile Ile Phe Asn Ser Val Phe Leu Asp Phe Ser Val
915 920 925

Ser Phe Trp Ile Arg Ile Pro Lys Tyr Lys Asn Asp Gly Ile Gln Asn
930 935 940

Tyr Ile His Asn Glu Tyr Thr Ile Ile Asn Cys Met Lys Asn Asn Ser
945 950 955 960

Gly Trp Lys Ile Ser Ile Arg Gly Asn Arg Ile Ile Trp Thr Leu Ile
965 970 975

Asp Ile Asn Gly Lys Thr Lys Ser Val Phe Phe Glu Tyr Asn Ile Arg
980 985 990

Glu Asp Ile Ser Glu Tyr Ile Asn Arg Trp Phe Phe Val Thr Ile Thr
995 1000 1005

Asn Asn Leu Asn Asn Ala Lys Ile Tyr Ile Asn Gly Lys Leu Glu Ser
1010 1015 1020

Asn Thr Asp Ile Lys Asp Ile Arg Glu Val Ile Ala Asn Gly Glu Ile
1025 1030 1035 1040

Ile Phe Lys Leu Asp Gly Asp Ile Asp Arg Thr Gln Phe Ile Trp Met
1045 1050 1055

Lys Tyr Phe Ser Ile Phe Asn Thr Glu Leu Ser Gln Ser Asn Ile Glu
1060 1065 1070

Glu Arg Tyr Lys Ile Gln Ser Tyr Ser Glu Tyr Leu Lys Asp Phe Trp
1075 1080 1085

Gly Asn Pro Leu Met Tyr Asn Lys Glu Tyr Tyr Met Phe Asn Ala Gly
1090 1095 1100

Asn Lys Asn Ser Tyr Ile Lys Leu Lys Lys Asp Ser Pro Val Gly Glu
1105 1110 1115 1120

Ile Leu Thr Arg Ser Lys Tyr Asn Gln Asn Ser Lys Tyr Ile Asn Tyr
1125 1130 1135

Arg Asp Leu Tyr Ile Gly Glu Lys Phe Ile Ile Arg Arg Lys Ser Asn
1140 1145 1150

Ser Gln Ser Ile Asn Asp Asp Ile Val Arg Lys Glu Asp Tyr Ile Tyr
1155 1160 1165

Leu



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2574 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2574

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT GAT CCT ATT GAT AAT 48
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT GCG AGA GGT ACG GGG AGA 96
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT TGG ATA ATA CCG GAA 144
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT AAT AAA AGT TCC GGT 192
Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

ATT TTT AAT AGA GAT GTT TGT GAA TAT TAT GAT CCA GAT TAC TTA AAT 240
Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

ACT AAT GAT AAA AAG AAT ATA TTT TTA CAA ACA ATG ATC AAG TTA TTT 288
Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

AAT AGA ATC AAA TCA AAA CCA TTG GGT GAA AAG TTA TTA GAG ATG ATT 336
Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

ATA AAT GGT ATA CCT TAT CTT GGA GAT AGA CGT GTT CCA CTC GAA GAG 384
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

TTT AAC ACA AAC ATT GCT AGT GTA ACT GTT AAT AAA TTA ATC AGT AAT 432
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

CCA GGA GAA GTG GAG CGA AAA AAA GGT ATT TTC GCA AAT TTA ATA ATA 480
Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

TTT GGA CCT GGG CCA GTT TTA AAT GAA AAT GAG ACT ATA GAT ATA GGT 528
Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

ATA CAA AAT CAT TTT GCA TCA AGG GAA GGC TTC GGG GGT ATA ATG CAA 576
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT CAA GAA 624
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 672
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT GTT TTA CAT GGA TTA TAT 720
Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

GGC ATT AAA GTA GAT GAT TTA CCA ATT GTA CCA AAT GAA AAA AAA TTT 768
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

TTT ATG CAA TCT ACA GAT GCT ATA CAG GCA GAA GAA CTA TAT ACA TTT 816
Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

GGA GGA CAA GAT CCC AGC ATC ATA ACT CCT TCT ACG GAT AAA AGT ATC 864
Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

TAT GAT AAA GTT TTG CAA AAT TTT AGA GGG ATA GTT GAT AGA CTT AAC 912
Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

AAG GTT TTA GTT TGC ATA TCA GAT CCT AAC ATT AAT ATT AAT ATA TAT 960
Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

AAA AAT AAA TTT AAA GAT AAA TAT AAA TTC GTT GAA GAT TCT GAG GGA 1008
Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

AAA TAT AGT ATA GAT GTA GAA AGT TTT GAT AAA TTA TAT AAA AGC TTA 1056
Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

ATG TTT GGT TTT ACA GAA ACT AAT ATA GCA GAA AAT TAT AAA ATA AAA 1104
Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

ACT AGA GCT TCT TAT TTT AGT GAT TCC TTA CCA CCA GTA AAA ATA AAA 1152
Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

AAT TTA TTA GAT AAT GAA ATC TAT ACT ATA GAG GAA GGG TTT AAT ATA 1200
Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

TCT GAT AAA GAT ATG GAA AAA GAA TAT AGA GGT CAG AAT AAA GCT ATA 1248
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

AAT AAA CAA GCT TAT GAA GAA ATT AGC AAG GAG CAT TTG GCT GTA TAT 1296
Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

AAG ATA CAA ATG TGT AAA AGT GTT AAA GCT CCA GGA ATA TGT ATT GAT 1344
Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

GTT GAT AAT GAA GAT TTG TTC TTT ATA GCT GAT AAA AAT AGT TTT TCA 1392
Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

GAT GAT TTA TCT AAA AAC GAA AGA ATA GAA TAT AAT ACA CAG AGT AAT 1440
Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

TAT ATA GAA AAT GAC TTC CCT ATA AAT GAA TTA ATT TTA GAT ACT GAT 1488
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

TTA ATA AGT AAA ATA GAA TTA CCA AGT GAA AAT ACA GAA TCA CTT ACT 1536
Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

GAT TTT AAT GTA GAT GTT CCA GTA TAT GAA AAA CAA CCC GCT ATA AAA 1584
Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

AAA ATT TTT ACA GAT GAA AAT ACC ATC TTT CAA TAT TTA TAC TCT CAG 1632
Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

ACA TTT CCT CTA GAT ATA AGA GAT ATA AGT TTA ACA TCT TCA TTT GAT 1680
Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp
545 550 555 560

GAT GCA TTA TTA TTT TCT AAC AAA GTT TAT TCA TTT TTT TCT ATG GAT 1728
Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp
565 570 575

TAT ATT AAA ACT GCT AAT AAA GTG GTA GAA GCA GGA TTA TTT GCA GGT 1776
Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly
580 585 590

TGG GTG AAA CAG ATA GTA AAT GAT TTT GTA ATC GAA GCT AAT AAA AGC 1824
Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser
595 600 605

AAT ACT ATG GAT AAA ATT GCA GAT ATA TCT CTA ATT GTT CCT TAT ATA 1872
Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile
610 615 620

GGA TTA GCT TTA AAT GTA GGA AAT GAA ACA GCT AAA GGA AAT TTT GAA 1920
Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu
625 630 635 640

AAT GCT TTT GAG ATT GCA GGA GCC AGT ATT CTA CTA GAA TTT ATA CCA 1968
Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro
645 650 655

GAA CTT TTA ATA CCT GTA GTT GGA GCC TTT TTA TTA GAA TCA TAT ATT 2016
Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile
660 665 670

GAC AAT AAA AAT AAA ATT ATT AAA ACA ATA GAT AAT GCT TTA ACT AAA 2064
Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys
675 680 685

AGA AAT GAA AAA TGG AGT GAT ATG TAC GGA TTA ATA GTA GCG CAA TGG 2112
Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp
690 695 700

CTC TCA ACA GTT AAT ACT CAA TTT TAT ACA ATA AAA GAG GGA ATG TAT 2160
Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr
705 710 715 720

AAG GCT TTA AAT TAT CAA GCA CAA GCA TTG GAA GAA ATA ATA AAA TAC 2208
Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
725 730 735

AGA TAT AAT ATA TAT TCT GAA AAA GAA AAG TCA AAT ATT AAC ATC GAT 2256
Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
740 745 750

TTT AAT GAT ATA AAT TCT AAA CTT AAT GAG GGT ATT AAC CAA GCT ATA 2304
Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile
755 760 765

GAT AAT ATA AAT AAT TTT ATA AAT GGA TGT TCT GTA TCA TAT TTA ATG 2352
Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met
770 775 780

AAA AAA ATG ATT CCA TTA GCT GTA GAA AAA TTA CTA GAC TTT GAT AAT 2400
Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn
785 790 795 800

ACT CTC AAA AAA AAT TTG TTA AAT TAT ATA GAT GAA AAT AAA TTA TAT 2448
Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr
805 810 815

TTG ATT GGA AGT GCA GAA TAT GAA AAA TCA AAA GTA AAT AAA TAC TTG 2496
Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu
820 825 830

AAA ACC ATT ATG CCG TTT GAT CTT TCA ATA TAT ACC AAT GAT ACA ATA 2544
Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile
835 840 845

CTA ATA GAA ATG TTT AAT AAA TAT AAT AGC 2574
Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser
850 855



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 858 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp
545 550 555 560

Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp
565 570 575

Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly
580 585 590

Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser
595 600 605

Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile
610 615 620

Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu
625 630 635 640

Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro
645 650 655

Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile
660 665 670

Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys
675 680 685

Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp
690 695 700

Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr
705 710 715 720

Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
725 730 735

Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
740 745 750

Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile
755 760 765

Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met
770 775 780

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn
785 790 795 800

Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr
805 810 815

Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu
820 825 830

Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile
835 840 845

Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser
850 855

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1644 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1644

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT GAT CCT ATT GAT AAT 48
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT GCG AGA GGT ACG GGG AGA 96
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT TGG ATA ATA CCG GAA 144
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT AAT AAA AGT TCC GGT 192
Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

ATT TTT AAT AGA GAT GTT TGT GAA TAT TAT GAT CCA GAT TAC TTA AAT 240
Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

ACT AAT GAT AAA AAG AAT ATA TTT TTA CAA ACA ATG ATC AAG TTA TTT 288
Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

AAT AGA ATC AAA TCA AAA CCA TTG GGT GAA AAG TTA TTA GAG ATG ATT 336
Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

ATA AAT GGT ATA CCT TAT CTT GGA GAT AGA CGT GTT CCA CTC GAA GAG 384
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

TTT AAC ACA AAC ATT GCT AGT GTA ACT GTT AAT AAA TTA ATC AGT AAT 432
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

CCA GGA GAA GTG GAG CGA AAA AAA GGT ATT TTC GCA AAT TTA ATA ATA 480
Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

TTT GGA CCT GGG CCA GTT TTA AAT GAA AAT GAG ACT ATA GAT ATA GGT 528
Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

ATA CAA AAT CAT TTT GCA TCA AGG GAA GGC TTC GGG GGT ATA ATG CAA 576
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT CAA GAA 624
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 672
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT GTT TTA CAT GGA TTA TAT 720
Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

GGC ATT AAA GTA GAT GAT TTA CCA ATT GTA CCA AAT GAA AAA AAA TTT 768
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

TTT ATG CAA TCT ACA GAT GCT ATA CAG GCA GAA GAA CTA TAT ACA TTT 816
Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

GGA GGA CAA GAT CCC AGC ATC ATA ACT CCT TCT ACG GAT AAA AGT ATC 864
Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

TAT GAT AAA GTT TTG CAA AAT TTT AGA GGG ATA GTT GAT AGA CTT AAC 912
Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

AAG GTT TTA GTT TGC ATA TCA GAT CCT AAC ATT AAT ATT AAT ATA TAT 960
Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

AAA AAT AAA TTT AAA GAT AAA TAT AAA TTC GTT GAA GAT TCT GAG GGA 1008
Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

AAA TAT AGT ATA GAT GTA GAA AGT TTT GAT AAA TTA TAT AAA AGC TTA 1056
Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

ATG TTT GGT TTT ACA GAA ACT AAT ATA GCA GAA AAT TAT AAA ATA AAA 1104
Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

ACT AGA GCT TCT TAT TTT AGT GAT TCC TTA CCA CCA GTA AAA ATA AAA 1152
Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

AAT TTA TTA GAT AAT GAA ATC TAT ACT ATA GAG GAA GGG TTT AAT ATA 1200
Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

TCT GAT AAA GAT ATG GAA AAA GAA TAT AGA GGT CAG AAT AAA GCT ATA 1248
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

AAT AAA CAA GCT TAT GAA GAA ATT AGC AAG GAG CAT TTG GCT GTA TAT 1296
Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

AAG ATA CAA ATG TGT AAA AGT GTT AAA GCT CCA GGA ATA TGT ATT GAT 1344
Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

GTT GAT AAT GAA GAT TTG TTC TTT ATA GCT GAT AAA AAT AGT TTT TCA 1392
Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

GAT GAT TTA TCT AAA AAC GAA AGA ATA GAA TAT AAT ACA CAG AGT AAT 1440
Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

TAT ATA GAA AAT GAC TTC CCT ATA AAT GAA TTA ATT TTA GAT ACT GAT 1488
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495






TTA ATA AGT AAA ATA GAA TTA CCA AGT GAA AAT ACA GAA TCA CTT ACT 1536
Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

GAT TTT AAT GTA GAT GTT CCA GTA TAT GAA AAA CAA CCC GCT ATA AAA 1584
Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

AAA ATT TTT ACA GAT GAA AAT ACC ATC TTT CAA TAT TTA TAC TCT CAG 1632
Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

ACA TTT CCT CTA 1644
Thr Phe Pro Leu
545



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 548 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190



Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

Thr Phe Pro Leu
545

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2616

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 5 72 

Gly Ala Gly Lys Phe Ala Thr Asd Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 72 0 

Leu lie His Ala Gly His Arg Leu Tyr Gly lie Ala lie Asn Pro Asn 
225 230 235 240 

CGC GTG TTC' AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 76 8 

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816 
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 
260 265 270 

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 86 4 

Phe lie Asp Ser Leu Gin Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912 
Lys Phe Lys Asp lie Ala Ser Thr Leu Asn Lys Ala Lys Ser lie Val 
290 295 300 

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 96 0 

Gly Thr Thr Ala Ser Leu Gin Tyr Met Lys Asn Val Phe Lys Glu Lys 
305 310 315 320 

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 100 8 

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu 
325 330 33 5 

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 10 5 6 

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu lie Tyr Thr Glu Asp 
340 345 350 

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AG A AAA ACA TAT TTG AAT 1104 
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lvs Thr Tyr Leu Asn 
355 360 * 365 

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 115 2 

Phe Asp Lys Ala Val Phe Lys lie Asn lie Val Pro Lys Val Asn Tyr 
370 375 380 

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 120 0 

Thr lie Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
385 390 395 400 

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 124 8 

Phe Asn Gly Gin Asn Thr Glu lie Asn Asn Met Asn Phe Thr Lys Leu 
405 410 415 

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 12 96 

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344 
Gly lie lie Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445 

GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT 13 9 2 

Ala Leu Asn Asp Leu Cys lie Lys Val Asn Asn Trp Asp Leu Phe Phe 
450 455 460 
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AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA GAA GAA 
Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 
465 470 475 480 

ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT AGT TTA 
lie Thr Ser Asp Thr Asn He Glu Ala Ala Glu Glu Asn He Ser Leu 
485 490 495 

GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT GAA CCT 
Asp Leu He Gin Gin Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 
500 505 sio 

GAA AAT. ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 
Glu Asn He Ser He Glu Asn Leu Ser Ser Asp He He Gly Gin Leu 
515 520 525 

GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 
Glu Leu Met Pro Asn He Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 
530 535 540 



1440 



1488 



1536 



1584 



1632 



TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824
Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872
Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920
Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 1968
Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG ATT GCA 2016
Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 2064
Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA AAT GAA 2112
Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA GCA AAG 2160
Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA GCT TTA 2208
Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG TAT AAT 2256
Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304
Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT TCT ATG 2400
Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 2448
Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA ATT GGT 2496
Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 2544
Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 2592
Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

ACA TTT ACT GAA TAT ATT AAG TAA 2616
Thr Phe Thr Glu Tyr Ile Lys *
865 870
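
The CDS feature of SEQ ID NO: 25 runs from nucleotide 1 to 2616, which corresponds to 872 codons including the terminal stop codon. The following is a minimal sketch, not part of the original listing, of how the paired nucleotide and amino acid lines can be cross-checked using the standard genetic code. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative, and the sketch assumes the reading frame starts at nucleotide 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, as used in the listing ("*" marks a stop).
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr", "*": "*",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (whitespace ignored) codon by codon."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# First and last rows of SEQ ID NO: 25 as listed.
first_row = "ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT"
last_row = "ACA TTT ACT GAA TAT ATT AAG TAA"

print(" ".join(translate(first_row)))
print(" ".join(translate(last_row)))
```

Running the same function over every row is one way to catch transcription errors in either the nucleotide or the amino acid line of a row.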



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 872 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

ATGCCGGTTA CCATCAACAA CTTCAACTAC AACGACCCGA TCGACAACAA CAACATCATC 60

ATGATGGAAC CGCCGTTCGC ACGTGGTACC GGTCGTTACT ACAAGGCTTT CAAGATCACC 120

GACCGTATCT GGATCATCCC GGAACGTTAC ACCTTCGGTT ACAAACCTGA GGACTTCAAC 180

AAGAGTAGCG GGATTTTCAA TCGTGACGTC TGCGAGTACT ATGATCCAGA TTATCTGAAT 240

ACCAACGATA AGAAGAACAT ATTCCTTCAG ACTATGATCA AGTTATTTAA TAGAATCAAA 300

TCAAAACCAT TGGGTGAAAA GTTATTAGAG ATGATTATAA ATGGTATACC TTATCTTGGA 360

[illegible in source] 420

TTAATCAGTA ATCCAGGAGA AGTGGAGCGA AAAAAAGGTA TTTTCGCAAA TTTAATAATA 480

TTTGGACCTG GGCCAGTTTT AAATGAAAAT GAGACTATAG ATATAGGTAT ACAAAATCAT 540

TTTGCATCAA GGGAAGGCTT CGGGGGTATA ATGCAAATGA AGTTTTGCCC AGAATATGTA 600

AGCGTATTTA ATAATGTTCA AGAAAACAAA GGCGCAAGTA TATTTAATAG ACGTGGATAT 660

TTTTCAGATC CAGCCTTGAT ATTAATGCAT GAACTTATAC ATGTTTTACA TGGATTATAT 720

GGCATTAAAG TAGATGATTT ACCAATTGTA CCAAATGAAA AAAAATTTTT TATGCAATCT 780

ACAGATGCTA TACAGGCAGA AGAACTATAT ACATTTGGAG GACAAGATCC CAGCATCATA 840

ACTCCTTCTA CGGATAAAAG TATCTATGAT AAAGTTTTGC AAAATTTTAG AGGGATAGTT 900

GATAGACTTA ACAAGGTTTT AGTTTGCATA TCAGATCCTA ACATTAATAT TAATATATAT 960

AAAAATAAAT TTAAAGATAA ATATAAATTC GTTGAAGATT CTGAGGGAAA ATATAGTATA 1020

GATGTAGAAA GTTTTGATAA ATTATATAAA AGCTTAATGT TTGGTTTTAC AGAAACTAAT 1080

ATAGCAGAAA ATTATAAAAT AAAAACTAGA GCTTCTTATT TTAGTGATTC CTTACCACCA 1140

GTAAAAATAA AAAATTTATT AGATAATGAA ATCTATACTA TAGAGGAAGG GTTTAATATA 1200

TCTGATAAAG ATATGGAAAA AGAATATAGA GGTCAGAATA AAGCTATAAA TAAACAAGCT 1260

TATGAAGAAA TTAGCAAGGA GCATTTGGCT GTATATAAGA TACAAATGTG TAAAAGTGTT 1320

AAAGCTCCAG GAATATGTAT TGATGTTGAT AATGAAGATT TGTTCTTTAT AGCTGATAAA 1380

AATAGTTTTT CAGATGATTT ATCTAAAAAC GAAAGAATAG AATATAATAC ACAGAGTAAT 1440

TATATAGAAA ATGACTTCCC TATAAATGAA TTAATTTTAG ATACTGATTT AATAAGTAAA 1500

ATAGAATTAC CAAGTGAAAA TACAGAATCA CTTACTGATT TTAATGTAGA TGTTCCAGTA 1560

[illegible] [illegible] AAAAAAAATT TTTACAGATG AAAATACCAT CTTTCAATAT 1620

[illegible] [illegible] TCTAGATATA AGAGATATAA GTTTAACATC TTCATTTGAT 1680

GATGCATTAT TATTTTCTAA CAAAGTTTAT TCATTTTTTT CTATGGATTA TATTAAAACT 1740

GCTAATAAAG TGGTAGAAGC AGGATTATTT GCAGGTTGGG TGAAACAGAT AGTAAATGAT 1800

TTTGTAATCG AAGCTAATAA AAGCAATACT ATGGATAAAA TTGCAGATAT ATCTCTAATT 1860

GTTCCTTATA [illegible] TTTAAATGTA GGAAATGAAA CAGCTAAAGG AAATTTTGAA 1920

[illegible] [illegible] [illegible] CTACTAGAAT TTATACCAGA [illegible] 1980

[illegible] [illegible] ATTAGAATCA TATATTGACA ATAAAAATAA AATTATTAAA 2040

[illegible] ATGCTTTAAC TAAAAGAAAT GAAAAATGGA GTGATATGTA CGGATTAATA 2100

[illegible] [illegible] ACTTAATACT CAATTTTATA CAATAAAAGA GGGAATGTAT 2160

[illegible] [illegible] ACAAGCATTG GAAGAAATAA TAAAATACAG ATATAATATA 2220

[illegible] AAGAAAAGTC AAATATTAAC ATCGATTTTA ATGATATAAA TTCTAAACTT 2280

[illegible] [illegible] TATAGATAAT ATAAATAATT TTATAAATGG ATGTTCTGTA 2340

TCATATTTAA TGAAAAAAAT GATTCCATTA GCTGTAGAAA AATTACTAGA CTTTGATAAT 2400

ACTCTCAAAA AAAATTTGTT AAATTATATA GATGAAAATA AATTATATTT GATTGGAAGT 2460

GCAGAATATG AAAAATCAAA AGTAAATAAA TACTTGAAAA CCATTATGCC GTTTGATCTT 2520

TCAATATATA CCAATGATAC AATACTAATA GAAATGTTTA ATAAATATAA TAGC 2574
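
Nucleotide listings such as SEQ ID NO: 27 are laid out in rows of six 10-base blocks, each row ending with the running position of its last base. The following is a minimal sketch, not part of the original listing, of checking that layout on the last two rows of SEQ ID NO: 27. The function names `parse_row` and `check_rows` are illustrative, and the sketch assumes the position numbers have already been cleaned of stray spaces.

```python
def parse_row(line: str) -> tuple[str, int]:
    """Split a listing row into its joined bases and its trailing position."""
    *blocks, position = line.split()
    seq = "".join(blocks).upper()
    if not set(seq) <= set("ACGT"):
        raise ValueError(f"unexpected character in row ending at {position}")
    return seq, int(position)


def check_rows(rows: list[str], start: int = 0) -> int:
    """Verify each row's trailing number equals the running base count.

    `start` is the number of bases that precede the first row given.
    Returns the position of the last base checked.
    """
    total = start
    for line in rows:
        seq, position = parse_row(line)
        total += len(seq)
        if total != position:
            raise ValueError(f"row numbered {position} ends at base {total}")
    return total


# Last two rows of SEQ ID NO: 27 as listed; 2460 bases precede them.
rows = [
    "GCAGAATATG AAAAATCAAA AGTAAATAAA TACTTGAAAA CCATTATGCC GTTTGATCTT 2520",
    "TCAATATATA CCAATGATAC AATACTAATA GAAATGTTTA ATAAATATAA TAGC 2574",
]
print(check_rows(rows, start=2460))
```

The final row holds 54 bases rather than 60, which is why the listing ends at 2574, in agreement with the stated length.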



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2574 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

ATGCCAGTTA CAATAAATAA TTTTAATTAT AATGATCCTA TTGATAATAA TAATATTATT 60

ATGATGGAGC CTCCATTTGC GAGAGGTACG GGGAGATATT ATAAAGCTTT TAAAATCACA 120

GATCGTATTT GGATAATACC GGAAAGATAT ACTTTTGGAT ATAAACCTGA GGATTTTAAT 180

AAAAGTTCCG GTATTTTTAA TAGAGATGTT TGTGAATATT ATGATCCAGA TTACTTAAAT 240

ACTAATGATA AAAAGAATAT ATTTTTACAA ACAATGATCA AGTTATTTAA TAGAATCAAA 300

TCAAAACCAT TGGGTGAAAA GTTATTAGAG ATGATTATAA ATGGTATACC TTATCTTGGA 360

GATAGACGTG TTCCACTCGA AGAGTTTAAC ACAAACATTG CTAGTGTAAC TGTTAATAAA 420

TTAATCAGTA ATCCAGGAGA AGTGGAGCGA AAAAAAGGTA TTTTCGCAAA TTTAATAATA 480

TTTGGACCTG GGCCAGTTTT AAATGAAAAT GAGACTATAG ATATAGGTAT ACAAAATCAT 540

TTTGCATCAA GGGAAGGCTT CGGGGGTATA ATGCAAATGA AGTTTTGCCC AGAATATGTA 600

AGCGTATTTA ATAATGTTCA AGAAAACAAA GGCGCAAGTA TATTTAATAG ACGTGGATAT 660

TTTTCAGATC CAGCCTTGAT ATTAATGCAT GAACTCATCC ACGTCCTCCA CGGTCTCTAC 720

GGTATCAAAG TAGACGACCT CCCGATCGTC CCGAACGAAA AAAAATTCTT CATGCAGAGC 780

ACCGACGCAA TCCAGGCAGA AGAACTCTAC ACCTTCGGTG GTCAGGACCC GAGCATCATC 840

ACCCCGAGCA CCGACAAAAG CATCTACGAC AAAGTCCTCC AGAACTTCCG TGGTATCGTC 900

GACCGTCTCA ACAAAGTCCT CGTCTGCATC AGCGACCCGA ACATCAACAT CAACATCTAC 960

AAAAACAAAT TCAAAGACAA ATACAAATTC GTCGAAGACA GCGAAGGTAA ATACAGCATC 1020

GACGTCGAGA GCTTCGACAA ACTCTACAAA AGCCTCATGT TCGGTTTCAC CGAAACCAAC 1080

ATCGCAGAAA ACTACAAAAT CAAAACCCGT GCAAGCTACT TCAGCGACAG CCTCCCGCCG 1140

GTCAAAATCA AAAACCTCCT CGACAACGAA ATCTACACCA TCGAAGAAGG TTTCAACATC 1200

AGCGACAAAG ACATGGAAAA AGAATACCGT GGTCAGAACA AAGCAATCAA CAAACAAGCT 1260

TACGAAGAAA TCAGCAAAGA ACACCTCGCA GTCTACAAAA TCCAGATGTG CAAAAGCGTC 1320

AAAGCACCGG GTATCTGCAT CGACGTTGAC AACGAAGACC TCTTCTTCAT CGCAGACAAA 1380

AACAGCTTCA GCGACGACCT CAGCAAAAAC GAACGTATCG AATACAACAC CCAGAGCAAC 1440

TACATCGAAA ACGACTTCCC GATCAACGAA CTCATCCTCG ACACCGACCT CATCAGCAAA 1500

ATCGAACTCC CGAGCGAAAA CACCGAAAGC CTCACCGACT TCAACGTTGA CGTCCCGGTC 1560

TACGAAAAAC AGCCGGCAAT CAAAAAAATC TTCACCGACG AAAACACCAT CTTCCAGTAC 1620

CTCTACAGCC AGACCTTCCC GCTAGATATA AGAGATATAA GTTTAACATC TTCATTTGAT 1680

GATGCATTAT TATTTTCTAA CAAAGTTTAT TCATTTTTTT CTATGGATTA TATTAAAACT 1740

GCTAATAAAG TGGTAGAAGC AGGATTATTT GCAGGTTGGG TGAAACAGAT AGTAAATGAT 1800

TTTGTAATCG AAGCTAATAA AAGCAATACT ATGGATAAAA TTGCAGATAT ATCTCTAATT 1860

GTTCCTTATA TAGGATTAGC TTTAAATGTA GGAAATGAAA CAGCTAAAGG AAATTTTGAA 1920

AATGCTTTTG AGATTGCAGG AGCCAGTATT CTACTAGAAT TTATACCAGA ACTTTTAATA 1980

CCTGTAGTTG GAGCCTTTTT ATTAGAATCA TATATTGACA ATAAAAATAA AATTATTAAA 2040

CAATAG ATA ATGCTTTAAC TAAAAGAAAT GAAAAATGGA GTGATATGTA CGGATTAATA 2100

GTAGCGCAAT GGCTCTCAAC AGTTAATACT CAATTTTATA CAATAAAAGA GGGAATGTAT 2160

AAGGCTTTAA ATTATCAAGC ACAAGCATTG GAAGAAATAA TAAAATACAG ATATAATATA 2220

TATTCTGAAA AAGAAAAGTC AAATATTAAC ATCGATTTTA ATGATATAAA TTCTAAACTT 2280

[illegible] [illegible] [illegible] ATAAATAATT TTATAAATGG ATGTTCTGTA 2340

TCATATTTAA TGAAAAAAAT GATTCCATTA GCTGTAGAAA AATTACTAGA CTTTGATAAT 2400

ACTCTCAAAA AAAATTTGTT AAATTATATA GATGAAAATA AATTATATTT GATTGGAAGT 2460

GCAGAATATG AAAAATCAAA AGTAAATAAA TACTTGAAAA CCATTATGCC GTTTGATCTT 2520

TCAATATATA CCAATGATAC AATACTAATA GAAATGTTTA ATAAATATAA TAGC 2574
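
SEQ ID NO: 27 and SEQ ID NO: 28 are both listed as 2574 base pairs, and where both are legible they differ in some rows while agreeing exactly in others. The following is a minimal sketch, not part of the original listing, of counting codon differences between two aligned rows and classifying them as synonymous or not under the standard genetic code. It uses the row ending at position 1260 from each listing and assumes both are read in frame from nucleotide 1, so that base 1201 starts a codon. The name `compare_codons` is illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_codons(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    a = "".join(seq_a.split()).upper()
    b = "".join(seq_b.split()).upper()
    if len(a) != len(b) or len(a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of three")
    synonymous = nonsynonymous = 0
    for i in range(0, len(a), 3):
        codon_a, codon_b = a[i:i + 3], b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Bases 1201-1260 of each listing, as printed.
row_27 = "TCTGATAAAG ATATGGAAAA AGAATATAGA GGTCAGAATA AAGCTATAAA TAAACAAGCT"
row_28 = "AGCGACAAAG ACATGGAAAA AGAATACCGT GGTCAGAACA AAGCAATCAA CAAACAAGCT"

result = compare_codons(row_27, row_28)
print(result)
```

For this row every differing codon encodes the same amino acid in both listings, which is consistent with the two sequences differing in codon usage rather than in the encoded polypeptide over this stretch.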
CLAIMS

1. A polypeptide comprising first and second domains, wherein said first 
domain is adapted to cleave one or more vesicle or plasma-membrane associated 
proteins essential to exocytosis, and wherein said second domain is adapted (i) to 
translocate the polypeptide into a cell or (ii) to increase the solubility of the 
polypeptide compared to the solubility of the first domain on its own or (iii) both 
to translocate the polypeptide into a cell and to increase the solubility of the 
polypeptide compared to the solubility of the first domain on its own, said 
polypeptide being free of clostridial neurotoxin and free of clostridial neurotoxin 
precursor that can be converted into toxin by proteolytic action. 

2. A polypeptide according to Claim 1 wherein said first domain comprises a 
clostridial toxin light chain. 

3. A polypeptide according to Claim 1 wherein said first domain comprises a 
fragment or variant of a clostridial toxin light chain. 

4. A polypeptide according to Claim 2 or 3 wherein the clostridial toxin is a 
botulinum toxin. 

5. A polypeptide according to any preceding claim wherein the first domain 
exhibits endopeptidase activity specific for a substrate selected from one or more 
of SNAP-25, synaptobrevin/VAMP and syntaxin. 

6. A polypeptide according to any preceding claim wherein said second domain 
comprises a clostridial toxin heavy chain H N portion. 

7. A polypeptide according to any of Claims 1-5 wherein said second domain 
comprises a fragment or variant of a clostridial toxin heavy chain H N portion. 

8. A polypeptide according to Claim 6 or 7 wherein the clostridial toxin is a

9. A polypeptide according to any of Claims 1-8 further comprising a third
domain adapted for binding of the polypeptide to a cell, by binding of the third 
domain directly to a cell or by binding of the third domain to a ligand or to ligands 
that bind to a cell. 

10. A polypeptide according to Claim 9 wherein said third domain is for binding 
the polypeptide to an immunoglobulin. 

11. A polypeptide according to Claim 10 wherein said third domain is a tandem
repeat synthetic IgG binding domain derived from domain B of Staphylococcal
protein A.

12. A polypeptide according to Claim 9 wherein said third domain comprises an 
amino acid sequence that binds to a cell surface receptor. 

13. A polypeptide according to Claim 12 wherein said third domain is insulin-like
growth factor-1 (IGF-1). 

14. A polypeptide according to any preceding claim comprising a botulinum toxin 
light chain or a fragment or a variant of a botulinum toxin light chain and a portion 
designated H N of a botulinum toxin heavy chain. 

15. A polypeptide according to Claim 14 wherein one or both of (a) the toxin 
light chain or fragment or variant of toxin light chain and (b) the portion of the toxin 
heavy chain are of botulinum toxin type A. 

16. A polypeptide according to Claim 15 wherein the botulinum toxin type A 
light chain variant has at residue 2 a glutamate, at residue 26 a lysine and at 
residue 27 a tyrosine.

17. A polypeptide according to Claim 14 wherein one or both of (a) the toxin
light chain or fragment or variant of toxin light chain and (b) the portion of the toxin 
heavy chain are of botulinum toxin type B. 

18. A polypeptide according to any of Claims 1-13 comprising a botulinum toxin
light chain or a fragment or a variant of a botulinum toxin light chain and at least 
100 N-terminal amino acids of a botulinum toxin heavy chain. 

19. A polypeptide according to Claim 18 comprising a botulinum toxin type B 
light chain, or a fragment or variant thereof, and 107 N-terminal amino acids of a 
botulinum toxin type B heavy chain. 

20. A polypeptide according to Claim 15 or 16 comprising at least 423 of the
N-terminal amino acids of botulinum toxin type A heavy chain.

21. A polypeptide according to Claim 20 comprising a botulinum toxin type A 
light chain and 423 N-terminal amino acids of a botulinum toxin type A heavy 
chain. 

22. A polypeptide according to Claim 20 comprising a botulinum toxin type A 
light chain variant wherein residue 2 is a glutamate, residue 26 is a lysine and 
residue 27 is a tyrosine, and 423 N-terminal amino acids of a botulinum toxin type 
A heavy chain. 

23. A polypeptide according to Claim 17 comprising at least 417 of the N- 
terminal amino acids of botulinum toxin type B heavy chain. 

24. A polypeptide according to Claim 23 comprising a botulinum toxin type B
light chain and 417 N-terminal amino acids of a botulinum toxin type B heavy
chain.

25. A polypeptide according to any of Claims 14-24 lacking a portion designated
H c of a botulinum toxin heavy chain.

26. A polypeptide comprising a botulinum toxin light chain and a fragment of a
botulinum toxin heavy chain, said fragment being not capable of binding to cell 
surface receptors. 

27. A polypeptide according to Claim 26 lacking an intact portion designated H c 
of a botulinum toxin heavy chain. 

28. A polypeptide according to any preceding claim comprising a variant of a 
clostridial toxin and further comprising a site for cleavage by a proteolytic enzyme, 
which cleavage site is not present in the native toxin. 

29. A polypeptide according to Claim 28 comprising a variant of a clostridial 
toxin light chain and further comprising a site for cleavage by a proteolytic enzyme, 
which cleavage site is not present in the native toxin light chain. 

30. A polypeptide according to Claim 28 or 29 comprising a variant of a 
clostridial toxin heavy chain H N portion and further comprising a site for cleavage 
by a proteolytic enzyme, which cleavage site is not present in the native toxin 
heavy chain H N portion. 

31. A polypeptide according to Claim 28, 29 or 30 obtainable by modification
of a DNA encoding the polypeptide so as to introduce one or more nucleotides 
coding for the cleavage site. 

32. A fusion protein comprising a fusion of (a) a polypeptide according to any 
of Claims 1-31 with (b) a second polypeptide being a polypeptide or oligopeptide 
adapted for binding to an affinity matrix so as to enable purification of the fusion 
protein using said matrix. 



33. A fusion protein according to Claim 32 wherein said second polypeptide is
adapted to bind to a chromatography column, such as an affinity matrix of
glutathione Sepharose. 

34. A fusion protein according to Claim 32 or 33 wherein a specific protease 
cleavage site is incorporated between the first and second polypeptides, said 
protease site enabling proteolytic separation of first and second polypeptides. 

35. A composition comprising a derivative of a clostridial toxin, said derivative
retaining at least 10% of the endopeptidase activity of the botulinum toxin, said
derivative further being non-toxic in vivo due to its inability to bind to cell surface
receptors, and wherein the composition is free of any component, such as toxin or
a further toxin derivative, that is toxic in vivo.

36. A composition according to Claim 35 or a polypeptide according to any of 
Claims 1-31 or a fusion protein according to Claim 32, 33 or 34 for use as a 
positive control in a toxin assay. 

37. A composition according to Claim 35 or a polypeptide according to any of 
Claims 1-31 or a fusion protein according to Claim 32, 33 or 34 for use as a 
vaccine against clostridial toxin. 

38. A composition according to Claim 35 or a polypeptide according to any of 
Claims 1-31 or a fusion protein according to Claim 32, 33 or 34 for in vivo use. 

39. A pharmaceutical composition comprising a composition according to Claim 
35, a polypeptide according to any of claims 1-31 or a fusion protein according to 
Claim 32, 33 or 34, in combination with a pharmaceutically acceptable carrier.

40. A nucleic acid encoding a polypeptide or a fusion protein according to any 
of Claims 1-34.

41. A nucleic acid encoding a polypeptide or a fusion protein according to Claim
40 and comprising nucleotides encoding residues 1-448 of a botulinum toxin type
A light chain. 

42. A nucleic acid according to Claim 40 or 41 comprising nucleotides encoding 
residues 1-423 of a botulinum toxin type A heavy chain H N domain. 

43. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 
40 and comprising nucleotides encoding residues 1-470 of a botulinum toxin type 
B light chain. 

44. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 
40 or 43 comprising nucleotides encoding residues 1-417 of a botulinum toxin type
B heavy chain H N domain. 

45. A nucleic acid according to any of Claims 40-44 comprising nucleotides 
encoding a restriction endonuclease cleavage site not present in native clostridial 
toxin sequence. 

46. A nucleotide according to Claim 45 obtainable by modification of a 
nucleotide encoding a polypeptide or fusion protein according to any of claims 1-34
so as to introduce said cleavage site. 

47. A DNA according to any of claims 40-46. 

48. A DNA selected from SEQ ID NOs: 1, 8, 10, 12, 14, 16, 18, 23 and 24.

49. A method of manufacture of a polypeptide according to any of Claims 1-31 
comprising expressing in a host cell a nucleic acid according to any of Claims 40- 
48 and recovering the polypeptide. 

50. A method of manufacture of a polypeptide according to any of Claims 1-31
comprising expressing in a host cell a nucleic acid encoding a fusion protein
according to Claim 32, 33 or 34, purifying the fusion protein by eluting the fusion
protein through an affinity matrix adapted to retain the fusion protein and eluting 
through said matrix a ligand adapted to displace the fusion protein, and recovering 
the fusion protein. 

51. A method of manufacture according to Claim 49 or 50 in which the nucleic
acid is DNA. 



52. A cell expressing a polypeptide or fusion protein according to any of Claims 
1-34. 
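
Claims 41 to 44 define nucleic acids by the residue ranges they encode. The following is a minimal sketch, not part of the claims, of converting a 1-based residue range into 1-based nucleotide coordinates within a coding sequence. The function name `residue_range_to_nucleotides` and the `cds_start` parameter are illustrative, and the second example assumes a heavy chain fragment that follows a 448-residue light chain directly in the same reading frame.

```python
def residue_range_to_nucleotides(
    first_residue: int, last_residue: int, cds_start: int = 1
) -> tuple[int, int]:
    """Map a 1-based inclusive residue range to 1-based inclusive nucleotides.

    `cds_start` is the position of the first base of codon 1.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and non-empty")
    start = cds_start + 3 * (first_residue - 1)
    end = cds_start + 3 * last_residue - 1
    return start, end


# Residues 1-448 occupy nucleotides 1-1344; in SEQ ID NO: 25 the row ending
# at residue 448 is indeed numbered 1344.
print(residue_range_to_nucleotides(1, 448))

# Assumption: 423 further residues follow directly in frame (residues 449-871),
# leaving nucleotides 2614-2616 for the stop codon of a 2616-base CDS.
print(residue_range_to_nucleotides(449, 871))
```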